In this report, we present the design details of four components of the
DBC. These four components form the structure loop of the DBC. They are the
keyword transformation unit (KXU), the structure memory (SM), the structure
memory information processor (SMIP) and the index translation unit (IXU).

The KXU converts keywords into their internal representations. The need for
and the implications of internal representations are discussed first. Data
structures maintained and algorithms executed by the KXU are then presented.
Finally, a hardware organization to realize the KXU is proposed.

The primary function of the SM is to retrieve and update structural
information of the database. This information is likely to be large (10^7
bytes or more). Furthermore, the operations on this information must be
performed at a rate commensurate with that of database operations performed by
the mass memory (MM). In this report, the concept of a partitioned content
addressable memory (PCAM) is used to implement the SM with the above
properties. Powerful PCAM organizations are possible using emerging
technologies. To this end, three design alternatives using different
technologies are examined. The three technologies are magnetic bubble memories,
charge-coupled devices (CCDs) and electron beam addressable memories (EBAMs).
Algorithms which manipulate the storage systems and auxiliary data structures
maintained by the processing elements are also given. In order to enhance
performance of the SM during heavy updates, a look-aside buffer memory is
proposed.

The SMIP is responsible for performing set intersections on structural
information retrieved by the SM. The concept of PCAMs is once again
utilized to perform rapid intersection. The IXU is intended to decode the
structural information output by the SMIP.

The four components are designed to operate concurrently. Keywords are
sent to the KXU at regular intervals by the DBC's command and control
processor (DBCCP). The output of the KXU is sent to the SM, which retrieves
index terms for the transformed keyword predicates and sends them to the
SMIP. The SMIP output is interpreted by the IXU and sent to the DBCCP.
This pipeline of processors results in maximum utilization of the hardware.

The remaining components of the DBC form the data loop and are described
in Part III.
1. INTRODUCTION

1.1 Background

The database computer (DBC) is a specialized back-end computer which is
capable of managing data of 10^9-10^10 bytes in size and supporting known data
models such as relational, network, hierarchical and attribute-based models.
In addition to its intended purpose of handling large databases and interfacing
with various data models, the DBC is one of the first database machines which
have built-in protection mechanisms for access control and clustering mechanisms
for performance enhancement.

We intend to issue a series of reports on the DBC. The first one on the
concepts and capabilities appears in [3]. Two reports on the design of the
DBC are scheduled. Additional reports will be on the use of the DBC in
supporting various data models. This report represents the first of the two
design documents. Although this report is self-contained, the reader may wish
to refer to Part I [3] for motivation and clarification of concepts and
definitions.

1.2 Scope

In this report, we present the design details of four components of the
DBC. These four components form the structure loop of the DBC (see Figure 1).
They are the keyword transformation unit (KXU), the structure memory (SM), the
structure memory information processor (SMIP) and the index translation unit
(IXU). The KXU, discussed in Section 2, converts keywords (sent by the DBCCP)
into their internal representations. The need for and the implications of
internal representations are discussed first. Data structures maintained and
algorithms executed by the KXU are then presented. Finally, a hardware
organization to realize the KXU is proposed.

The primary function of the SM is to retrieve and update structural
information of the database. This information is likely to be large (10^7
bytes or more). Furthermore, the operations on this information must be
performed at a rate commensurate with that of database operations performed by
the mass memory (MM). In Section 3, the concept of a partitioned content
addressable memory (PCAM) [3] is used to implement the storage system of the
SM with the above properties. Powerful PCAM organizations are possible using
emerging technologies. To this end, three design alternatives using three
different technologies are examined. The three technologies are magnetic bubble
memories, charge-coupled devices (CCDs) and electron beam addressable memories
(EBAMs). Algorithms which manipulate the storage systems and auxiliary data
structures maintained by the processing elements are also given. In order
to enhance performance of the SM during heavy updates, a look-aside buffer
memory is proposed.

The SMIP is responsible for performing set intersections on structural
information retrieved by the SM. In Section 4, the concept of PCAMs is once
again utilized to perform rapid intersection. The IXU is treated in the fifth
section and is intended to decode the structural information output by the
SMIP.

The four components are designed to operate concurrently. Keywords are
sent to the KXU at regular intervals by the DBCCP. The output of the KXU is
sent to the SM, which retrieves index terms [3] for the transformed keyword
predicates and sends them to the SMIP. The SMIP output is interpreted by the
IXU and sent to the DBCCP. This pipeline of processors results in maximum
utilization of the hardware.

The remaining components of the DBC form the data loop (see Figure 1)
and are described in Part III [8].
2. THE KEYWORD TRANSFORMATION UNIT (KXU)

The keyword transformation unit (KXU) is intended for the performance
enhancement of the structure memory (SM) in the search, storage, and processing
of directory entries. The manner in which keywords are represented in the SM
plays a pivotal role in the efficient execution of these tasks. Thus, it is
important to examine the issues involved in the internal representation of
keywords and the mechanism employed to carry out the transformation of external
keywords into their internal representation. In this section, we propose a
hardware organization to realize keyword transformation.

2.1 Keyword Transformation

The most frequently used SM operation is the search command, which retrieves
directory entries of those keywords which satisfy the predicate associated with
the search command. One of the purposes of transforming keywords is to enable
the SM to readily identify the sectors of the partitioned content addressable
memory (PCAM) to be searched for a given keyword. (Recall from [3] that the
SM is implemented using a PCAM.) Hashing as a search technique accomplishes
this. The basic strategy of any hashing technique is to partition the search
space into mutually exclusive sets called buckets. Buckets usually comprise one
or more sectors of the PCAM. Such an arrangement, then, allows the SM to search
for the directory entry of a keyword within the confines of a few sectors instead
of the entire PCAM.

Another reason for transforming keywords is related to efficient use of
PCAM storage. Keywords used by the database user are not constrained to be of
fixed length. Variable-length keywords will, in general, require larger storage
space than fixed-length ones, since a variable-length keyword will require a
field to indicate the length of the keyword in addition to the keyword itself.
Thus, it is advantageous to store keywords in their encoded form of fixed length.

A third reason is related to the efficient processing of information which
is known a priori to be of fixed length. Since variable-length keywords are
internally represented as fixed-length fields, keyword comparisons can be made
at a faster rate in the SM.

From the discussion, we conclude that the keyword transformation involves
the encoding of variable-length keywords into fixed-length form by hashing.

How does the use of hash encoded directory entries affect the operation of
the DBC? In a directory where keywords are stored in their original form, all
directory entries are distinct. However, this is not guaranteed to be the case
when transformed keywords are used, since two keywords could be transformed into
the same hash code (i.e., there is a collision). We will call this structural
ambiguity. In order to determine the severity of structural ambiguity existing
in the DBC, we shall consider the probability p of two or more keywords being
transformed into the same hash code (i.e., the probability of one or more
collisions).

$$p = 1 - \frac{n(n-1)\cdots(n-(N-1))}{n^N}$$

where N is the number of keywords and n = 2^L is the size of the address space
determined by the length L (in bits) of the hash code.
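
The expression for p has the same form as the well-known birthday problem. As a rough numerical illustration (the function name and the assumption that hash codes are uniformly and independently distributed are ours, not part of the design), it might be evaluated as follows:

```python
import math


def collision_probability(num_keywords: int, code_bits: int) -> float:
    """Probability that at least two of N keywords share a hash code.

    Assumption (ours): all n = 2**code_bits codes are equally likely and
    keywords are hashed independently of one another.
    """
    n = 2 ** code_bits
    if num_keywords > n:
        return 1.0  # more keywords than codes: a collision is certain
    # log of the product n(n-1)...(n-(N-1)) / n**N, accumulated term by term
    log_no_collision = sum(math.log1p(-i / n) for i in range(num_keywords))
    # p = 1 - exp(log_no_collision), computed without loss of precision
    return -math.expm1(log_no_collision)
```

Summing logarithms avoids forming the very large numerator and denominator directly, which matters for code lengths such as L = 24 or L = 48.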

The inability of the SM to distinguish between keywords having the same
internal representation may result in MAU accesses which do not have any record
satisfying a user query. (For a given MAU address f, the MAU access is defined
to be the process of locating and searching the f-th MAU by the mass memory (MM).
Note that this does not imply that records in such MAUs will be passed on to the
user.) We shall call this problem, arising from structural ambiguity, access
imprecision. How often does the problem of access imprecision take place in the
system? To answer this question we need to compute the probability p' that a
partition of the MM will be searched for a query Q even though not all of the
keywords in the query are in the keyword basis of the partition. Since a
partition of the MM is characterized by a triple (MAU address f, cluster number
c, security atom name s), the keyword basis of the partition is

$$KB(f,c,s) = \{\, K \mid \exists R \in (f,c,s) \wedge K \in R \,\}.$$

Let the cardinality of KB(f,c,s) be m. Further, let us assume that the query Q
is a conjunct of n equality keyword predicates and that r out of n of these
keywords also appear in KB(f,c,s) for some f, c, and s. Then we can show that

$$p' < \left(\frac{nm}{2^L}\right)^{n-r}.$$

When all keywords of Q except one appear in KB(f,c,s) (i.e., n-r = 1), then

$$p' < \frac{nm}{2^L}.$$

Under normal circumstances, nm << 2^L, so p' will be quite small. For
example, for nm = 1,000 and L = 48, p' < 4 x 10^-12; for nm = 1,000 and L = 32,
p' < 2.5 x 10^-7; and for nm = 1,000 and L = 24, p' < 6.2 x 10^-5. When more than
one keyword of Q is not in KB(f,c,s), then the value of p' becomes vanishingly
small. Furthermore, we observe that L can always be chosen to make p and p' as
small as we please. Thus, we conclude that the problem of access imprecision
due to structural ambiguity will not affect the performance of the DBC in any
significant manner.
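
The numerical examples quoted for nm = 1,000 can be checked with a few lines of arithmetic. The sketch below assumes that the bound for a single missing keyword is nm / 2^L, as the quoted figures suggest; the function name is ours.

```python
def single_miss_bound(nm: int, code_bits: int) -> float:
    """Bound nm / 2**L on p' when exactly one query keyword is missing.

    Assumption (ours): the bound has the form nm / 2**L, which is what
    the figures quoted in the text are consistent with.
    """
    return nm / 2 ** code_bits


for bits in (48, 32, 24):
    print(f"L = {bits}: nm / 2^L = {single_miss_bound(1000, bits):.2e}")
# L = 48: nm / 2^L = 3.55e-12
# L = 32: nm / 2^L = 2.33e-07
# L = 24: nm / 2^L = 5.96e-05
```

Each computed value lies just under the rounded figure given in the text, and each additional 8 bits of hash code reduces the bound by a factor of 256.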
2.2 Functional Specialization

Since every keyword has to undergo transformation before the SM is interrogated,
one might argue for the integration of the transformation logic with that of the SM
itself. However, there are strong reasons for identifying the transformation logic
as a separate functional unit.

First, the operation of the SM is functionally different from that of the KXU.
Searching on fixed-length fields is the single most important activity taking place
in the SM. In hardware terms, it involves efficient comparison operations. The
KXU, on the other hand, manipulates variable-length fields, and its work includes
shifting and rotating of the characters of keywords, arithmetic operations on
(arrays of) characters, and indexing for loop control. The KXU should, therefore,
be equipped with general-purpose and indexing registers on which the above types
of operations can be carried out.

Second, the SM and the KXU are meant to operate in parallel. That is, as
soon as a keyword has been transformed, the KXU is ready to process the next
keyword sent by the DBCCP. Meanwhile, the SM is retrieving, deleting, or inserting
the index term(s) for the transformed keyword. Such parallelism is difficult to
achieve if the SM is required to perform both keyword transformation and index
term retrieval or update.

Third, modularity is achieved by clearly identifying and separating the
functions of the KXU and the SM. It also lends itself to clear and simple
interfaces between the components involved.

Finally, the incorporation of reliability features is facilitated by considering
the reliability requirements of each of the components separately. The SM is mainly
a storage system requiring some form of redundancy coding to enhance reliability,
while the KXU is essentially a computing device requiring redundancy in the
processor to enhance reliability (e.g., triple modular redundancy [12]).

2.3 The Logical Design of the KXU

Good keyword transformation algorithms are highly dependent on the nature of
the keywords and their expected use. It is unreasonable to expect a single
transformation algorithm to be equally applicable to all keywords because keywords of
different files could have widely varying properties and could appear in different
types of queries. For example, one file could be queried on the basis of a simple
conjunct of equality predicates, whereas the queries for another file might consist
of less-than predicates. Further, we would require that directory entries of
keywords belonging to different files be stored in different parts of the SM.
Since storage in the SM is based on the buckets associated with keywords (see
Section 3), the KXU must ensure that keywords of different files are associated
with different buckets. The above discussions lead us to the following data
structures and processor logic.

2.3.1 Data Formats and Structures

Each file that is known to the system has a unique identification number (ID).
The DBCCP, in all its requests to the KXU, sends two pieces of information: the ID
of the file to which the keyword belongs and the keyword proper. This is depicted in
Figure 2. The keyword, in turn, has two parts: an attribute identifier and a value
field. This is shown in Figure 3. The attribute identifier identifies the attribute
of the keyword uniquely, not only among distinct attributes within the same file but
also among identical attributes of different files.

A transformed keyword T(K) consists of two parts: a 24-bit logical bucket
name and a 24-bit encoded representation of the keyword value. As shown in Figure 4,
the logical bucket name consists of two parts: a 16-bit attribute id (which is
identical to the one supplied by the DBCCP), and an 8-bit partition number. The
256 partitions that can be identified by the 8-bit partition field are used for
categorizing keywords of an attribute according to their values. For example, if
the value of an attribute 'salary' ranges from, say, $5,000 to $100,000, then we
might partition this range into 256 equal subranges. Thus, given a keyword, say,
salary = 50,000, we can easily determine its partition number as follows:

$$\text{Partition Number} = \frac{50{,}000 - 5{,}000}{(100{,}000 - 5{,}000)/256}$$

In case the attribute of a keyword has a non-numeric value (i.e., alphanumeric
value), partitioning of the values is done on the basis of the encoded order of
characters. One of the nice properties of the partitioning scheme is that it
preserves natural ordering among keywords of an attribute. This leads us to
observe that partitions are created to facilitate inequality predicates so that
they can be processed in a straightforward way by the SM.
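
To make the partitioning concrete, the following is a minimal sketch for integer-valued attributes. It assumes, beyond what the text states, that the quotient is rounded down, that partitions are numbered from 0, and that the maximum value is placed in the last partition; the function names are hypothetical.

```python
def partition_number(value: int, lo: int, hi: int, partitions: int = 256) -> int:
    """Map a value in [lo, hi] to one of `partitions` equal subranges.

    Assumptions (ours): round down, number partitions from 0, and put
    the maximum value `hi` into the last partition.
    """
    if hi <= lo:
        raise ValueError("the range must be non-empty")
    if not lo <= value <= hi:
        raise ValueError("value lies outside the declared range")
    # Integer form of (value - lo) / ((hi - lo) / partitions), rounded down
    index = (value - lo) * partitions // (hi - lo)
    return min(index, partitions - 1)


def partitions_for_at_least(value: int, lo: int, hi: int,
                            partitions: int = 256) -> range:
    """Partitions that can hold keywords whose value is >= `value`."""
    return range(partition_number(value, lo, hi, partitions), partitions)
```

For the salary example, `partition_number(50_000, 5_000, 100_000)` gives 121. Because the mapping never decreases as the value increases, a greater-than-or-equal predicate only needs the partitions from that number upward.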

The remaining 24 bits of the transformed keyword are obtained by applying
a hashing algorithm to the keyword value. Each active file in the system (i.e., a
file that is currently being accessed) is allowed to have a maximum of four different

[Figure 2. Input from DBCCP; the keyword proper is of variable length]

[Figure 4. The Parts of the Transformed Keyword T(K)]
this information is accessed and stored within the table. The AIT has two
parts: the access vector of 256 entries and a variable number of information
blocks called AIT blocks. Given the attribute identifier, the access
vector is used to access in a rapid fashion the information block of the
attribute. Each 24-byte AIT block contains the following information:

(a) pointer to the next AIT block belonging to an
    attribute with the same 8 low order bits,

(b) address of hash algorithm,

(c) 8 high order bits of attribute identifier,

(d) value type: 00 for fixed point number, 01
    for short floating point number, 10 for long
    floating point number, and 11 for alphanumeric
    values,

(e) maximum value of attribute,

(f) minimum value of attribute, and

(g) number of partitions desired (≤ 256).

The AIT blocks are allocated dynamically as required.

Another important data structure maintained by the KXU is the
hash algorithm library (HAL). This library contains the hash algorithms
of all the active files in the DBC. The HAL is organized into blocks of
400 bytes. No hash algorithm is expected to be longer than 400 bytes.
(This is a design decision which can be changed without affecting the
logic or performance of the DBC.) Thus, each hash algorithm is stored
completely in one block. HAL memory is allocated (and freed) one block at a
time. The hash algorithm addresses in AIT blocks (see Figure 5) point to
blocks in the HAL memory.

2.3.2 Processor Logic

The KXU operates in two distinct modes. These modes are known as the
load mode and the translate mode. By means of control signals the DBCCP
can set the KXU to the desired mode of operation. In the load mode, the
KXU accepts information regarding a file. Such information is used by the
KXU to build the AIT. In the translate mode, the KXU performs keyword
transformations. Algorithms presented here will be categorized as either load
or translate type depending on whether they perform in load or translate
mode.




ALGORITHM A (Load Type) - To build a set of hash algorithms for a file.

Input Arguments from the DBCCP: Hash algorithms on the input data
        lines. [Assume k (≤ 4) algorithms are to be read.]

Step 1: i ← 1.

Step 2: Read the i-th hash algorithm into the input buffer.

Step 3: Request a block in the HAL memory. (See Algorithm D for
        HAL allocations.) Let the address be HAL_i.

Step 4: Move the hash algorithm from the input buffer into the block at HAL_i.

Step 5: Set ADDR[i] to HAL_i. [ADDR is an array of four address
        words which are used to remember the addresses of the hash
        algorithms of a file within the HAL. ADDR is subsequently used
        by Algorithm B to set hash address pointers in the AIT blocks
        in step 8.]

Step 6: i ← (i + 1). If i ≤ k, then go to step 2; else, terminate.


ALGORITHM B (Load Type) - To create a set of AIT blocks for a file.

Input Arguments from the DBCCP: Attribute identifiers and related
        information. [Assume n AIT blocks have to be created.]

Step 1: i ← 1.

Step 2: Read the i-th attribute identifier and the related information
        from the DBCCP into the input buffer.

Step 3: Request an AIT block by using Algorithm C.

Step 4: Use the 8 low order bits of the attribute identifier A_i to
        address the access vector; let CURRENTPTR be the resulting index.

Step 5: Retrieve the AIT pointer pointed to by CURRENTPTR. Call
        it NEXTPTR.

Step 6: ACCESSVECTOR[CURRENTPTR] ← address of new AIT block.

Step 7: Set the AIT pointer in the new block to NEXTPTR.

Step 8: Move information from the input buffer into the AIT block. Using
        information in the ADDR array (see step 5 in Algorithm A), set the
        hash address pointer in the new AIT block.

Step 9: i ← i + 1.

Step 10: If i ≤ n, then go to step 2; else, terminate.
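
Steps 4 through 7 of Algorithm B amount to inserting a new block at the head of a chain selected by the 8 low order bits of the attribute identifier. The sketch below illustrates that idea together with the matching chain walk on the 8 high order bits. The class and field names are ours, and an AIT block is reduced to the fields needed for the illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AITBlock:
    high_bits: int                           # 8 high order bits of the attribute id
    hash_address: int                        # address of the hash algorithm in the HAL
    next_block: Optional["AITBlock"] = None  # next block with the same 8 low bits


class AttributeTable:
    """Access vector of 256 entries, each heading a chain of AIT blocks."""

    def __init__(self) -> None:
        self.access_vector = [None] * 256

    def insert(self, attribute_id: int, hash_address: int) -> AITBlock:
        current = attribute_id & 0xFF                # step 4: index from the low bits
        next_ptr = self.access_vector[current]       # step 5: old head of the chain
        block = AITBlock((attribute_id >> 8) & 0xFF, hash_address, next_ptr)  # step 7
        self.access_vector[current] = block          # step 6: new block becomes head
        return block

    def lookup(self, attribute_id: int) -> Optional[AITBlock]:
        high = (attribute_id >> 8) & 0xFF
        block = self.access_vector[attribute_id & 0xFF]
        # Follow the chain until the 8 high order bits match
        while block is not None and block.high_bits != high:
            block = block.next_block
        return block
```

Inserting at the head keeps the load operation at a fixed number of memory accesses, while a lookup costs one access to the vector plus one per block examined in the chain.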


ALGORITHM C (Load Type) - To allocate an AIT block. (In order to maintain
        AIT blocks, a bit map is used.)

Input Arguments: None.

Step 1: Scan the bit map for the first bit which is 0. If no bit is
        found, signal error condition. Turn the bit on.

Step 2: Compute the AIT block address corresponding to the bit found
        in step 1. Terminate.


ALGORITHM D (Load Type) - To allocate a HAL block. (In order to maintain
        HAL blocks, a bit map is used.)

Input Arguments: None.

Step 1: Scan the bit map for the first bit which is 0. If no bit is
        found, signal error condition. Turn the bit on.

Step 2: Compute the HAL block address corresponding to the bit
        found in step 1. Terminate.


ALGORITHM E (Load Type) - To deallocate an AIT/HAL block.

Input Arguments: None.

Step 1: Compute the bit number in the bit map corresponding to
        the block to be deallocated.

Step 2: Turn off the bit. Terminate.
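
Algorithms C, D and E describe a first-fit bit map allocator. A minimal sketch is given below, assuming (our assumption) that the blocks of a memory are laid out contiguously from a base address, so that a block address and its bit number can be computed from one another. The class name and its parameters are hypothetical.

```python
class BlockBitMap:
    """Bit map allocator for equal-sized blocks (24-byte AIT or 400-byte HAL)."""

    def __init__(self, base_address: int, block_size: int, num_blocks: int) -> None:
        self.base_address = base_address
        self.block_size = block_size
        self.bits = [0] * num_blocks  # 0 = free, 1 = allocated

    def allocate(self) -> int:
        """Algorithms C and D: claim the first free block, return its address."""
        for index, bit in enumerate(self.bits):
            if bit == 0:
                self.bits[index] = 1                                # turn the bit on
                return self.base_address + index * self.block_size
        raise MemoryError("no free block in the bit map")           # error condition

    def release(self, address: int) -> None:
        """Algorithm E: compute the bit number of the block and turn it off."""
        index = (address - self.base_address) // self.block_size
        self.bits[index] = 0
```

One instance with a block size of 24 could serve the AIT and another with a block size of 400 the HAL, which is why a single deallocation algorithm suffices for both.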


ALGORITHM F (Load Type) - To delete the AITs and hash algorithms of a file.

Input Arguments (from the DBCCP): m attribute identifiers.

Step 1: i ← 1.

Step 2: Use the 8 low order bits of the i-th argument attribute
        identifier as index into the access vector.

Step 3: Follow the pointer in the access vector entry until a
        match is obtained for the 8 high order bits of the argument
        attribute identifier.

Step 4: Release the HAL block pointed to by the hash algorithm
        address field of the AIT block found in step 3. (Use
        Algorithm E.)

Step 5: Release the AIT block found in step 3. (Use Algorithm E.)

Step 6: i ← i + 1. If i ≤ m, go to step 2; else, terminate.


ALGORITHM G (Translate Type) - To transform a given keyword proper into its
        encoded form.

Input Arguments: Keyword supplied by the DBCCP.

Step 1: Use the 8 low order bits of the attribute identifier in
        the keyword proper to index into the access vector.

Step 2: Follow the pointer in the access vector until the 8 high
        order bits match with the 8 bits stored in an AIT block.

Step 3: Retrieve the address of the hash algorithm from the AIT
        block and load the hash algorithm from the HAL into the
        control store.

Step 4: Compute the partition number for the keyword using the
        partition details in the AIT block and the keyword value.

Step 5: Invoke the hash algorithm in the control store with the
        keyword value as argument.

Step 6: Form the 48-bit transformed keyword by concatenating 16
        bits of attribute identifier, 8 bits of partition number
        computed in step 4, and 24 bits of hash code computed in
        step 5.

Step 7: Send the 48-bit transformed keyword to the SM and to the
        DBCCP. Terminate.
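
Step 6 of Algorithm G is a plain bit-field concatenation. The sketch below packs the three fields into a 48-bit integer. The hash function shown is only a stand-in (a CRC truncated to 24 bits) chosen for illustration, since the actual hash algorithms are supplied per file; all names are ours.

```python
import zlib


def stand_in_hash24(value: bytes) -> int:
    """Illustrative 24-bit hash; NOT the hash algorithm of the design."""
    return zlib.crc32(value) & 0xFFFFFF


def transform_keyword(attribute_id: int, partition: int, value: bytes,
                      hash24=stand_in_hash24) -> int:
    """Concatenate 16-bit attribute id, 8-bit partition and 24-bit hash code."""
    if not 0 <= attribute_id < (1 << 16):
        raise ValueError("attribute identifier must fit in 16 bits")
    if not 0 <= partition < (1 << 8):
        raise ValueError("partition number must fit in 8 bits")
    code = hash24(value) & 0xFFFFFF          # keep the 24 low order bits
    # Layout, high to low: attribute id | partition number | hash code
    return (attribute_id << 32) | (partition << 24) | code


def logical_bucket_name(transformed: int) -> int:
    """The 24 high order bits: attribute id followed by partition number."""
    return transformed >> 24
```

With this layout, two keywords of the same attribute whose values fall in the same partition share the 24 high order bits and differ only in the hash code.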


2.4 The Physical Realization of the KXU

From the above discussion, we can estimate the type and amount of
hardware required to realize the KXU. First, we need enough memory to
hold the AIT and HAL. Second, we require a mechanism to execute the seven
processor algorithms and the hash algorithms.

2.4.1 Table Memories in the KXU

Since the AIT is frequently accessed (once for every keyword transformation),
the performance of the KXU can be affected by the access time to the AIT.
Furthermore, the organization of the AIT requires at least two and possibly more
accesses before the required AIT block can be retrieved. Thus a random access
memory is best suited for implementing the AIT. The access times per word can be
in the range of 0.5 to 1 μsec, since we need to access a small number (6) of
words (= 24 bytes) at a time. We must now determine the size of the AIT memory.
Since each attribute requires one AIT block and an AIT block occupies 24 bytes,
it is easy to compute the AIT size if the file characteristics (i.e., the number
of attributes per file) and the number of active files in the system are given.
In Table I, we have tabulated the AIT sizes in terms of the number of attributes
and files. We next ask ourselves what technology we should use to implement the
AIT. There are a number of technologies which can provide the performance
characteristics that we need, but we should choose the one with the lowest cost.
The leading contenders for RAMs with access times of 0.5 - 1 μsec are core
technology and MOS/LSI technologies. Core is more competitive beyond 200 Kbytes
while MOS/LSI has the edge for sizes less than 200 Kbytes. This is illustrated in
the table by the broken line.
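
The AIT sizing rule is simple enough to state as a formula: 24 bytes per attribute, for every attribute of every active file. The sketch below applies it to sample figures that are purely hypothetical (they are not taken from Table I) and assumes that 200 Kbytes means 200,000 bytes.

```python
AIT_BLOCK_BYTES = 24  # one AIT block per attribute, as stated in the text


def ait_size_bytes(attributes_per_file: int, active_files: int) -> int:
    """Total AIT block storage, excluding the 256-entry access vector."""
    return AIT_BLOCK_BYTES * attributes_per_file * active_files


def suggested_technology(size_bytes: int, crossover_bytes: int = 200_000) -> str:
    """Core beyond the crossover size, MOS/LSI below it.

    Assumption (ours): 200 Kbytes is taken as 200,000 bytes.
    """
    return "core" if size_bytes > crossover_bytes else "MOS/LSI"


# Hypothetical workloads, for illustration only
small = ait_size_bytes(attributes_per_file=100, active_files=50)    # 120,000 bytes
large = ait_size_bytes(attributes_per_file=100, active_files=100)   # 240,000 bytes
```

Under these sample figures the smaller configuration falls on the MOS/LSI side of the crossover and the larger one on the core side.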

We now consider the hash algorithm program library (HAL) memory. In
Table II, we have tabulated the HAL size for different numbers of active
files in the DBC and for different hash algorithm sizes. (Four hash algorithms
are assumed for every active file.) The address of a hash algorithm within
the HAL to be used for a keyword is determined by looking up the corresponding
AIT block. Thus, the number of accesses to the HAL for the retrieval of a
hash algorithm is exactly one. This implies that, besides random access
memory, quasi-random access memories or even sequential access memories might
be used for storing the HAL. Although random access memories can be used to
implement the HAL, we can look for cheaper alternatives in fixed-head disks/
drums and sequential CCD memories. Access time for fixed-head disks consists
of only latency times, which vary from 5-10 msec. Transfer rates can be as
high as 1 Mbyte per second. Thus the total time to retrieve a hash algorithm
from the HAL (implemented on fixed-head disks) will be in the 5-10 millisecond
range. CCD memories seem to be able to perform better for almost the same cost.
For example, clever CCD organization [6,11] can lower access times to the
10-100 μsec range. The transfer rate is as good as or even better than that of
fixed-head disks.
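
The retrieval times quoted above follow from adding the access latency to the time needed to transfer one 400-byte block. A small worked calculation is sketched below; the function name is ours, and a transfer rate of 1 Mbyte per second (taken as 1,000,000 bytes per second) is assumed for both technologies.

```python
def retrieval_time_ms(latency_ms: float, block_bytes: int = 400,
                      rate_bytes_per_s: int = 1_000_000) -> float:
    """Latency plus transfer time for one HAL block, in milliseconds."""
    transfer_ms = block_bytes * 1000 / rate_bytes_per_s
    return latency_ms + transfer_ms


# Fixed-head disk: latency of 5 to 10 msec
disk_best = retrieval_time_ms(5.0)     # about 5.4 msec
disk_worst = retrieval_time_ms(10.0)   # about 10.4 msec

# CCD memory: access time of 10 to 100 microseconds
ccd_best = retrieval_time_ms(0.010)    # about 0.41 msec
ccd_worst = retrieval_time_ms(0.100)   # about 0.5 msec
```

For the disk the latency dominates, whereas for the CCD memory the 0.4 msec transfer of the block itself becomes the larger term.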

The final choice of technology for the HAL depends on the access times
required, since all the technologies discussed above can be used to build
HAL memories of the sizes tabulated in Table II. The access time to a hash
algorithm is crucial to the overall operating speed of the KXU. This in
turn is tied to the operating speed of the SM and other components in the
loop (see Fig. 1). Thus, we have with us a number of technology
alternatives with which to "tune" the KXU so that the KXU is never the
performance bottleneck of the structure loop.

2.4.2 Implementation of the KXU Logic

The KXU is proposed to be implemented using a microprogrammable
LSI microprocessor. The processor's microprogram memory has two parts,
a static store (implemented with a ROM) and a dynamic store (implemented
with a writable control store). The static storage area contains the seven
algorithms outlined earlier and a small control program to initiate the
algorithms in proper sequence. The dynamic area contains the hash algorithm
to be executed for a given keyword. There are several reasons for choosing
a microprogrammable microprocessor. First, given the present state of the
art of semiconductor technology, it is far more economical to build
control structures using array logic than random logic. Second, by using a
bit-slice microprocessor (like the INTEL 3000 series), decisions about data
widths can be postponed until a very late stage in the design cycle. Third,
the slower speed generally attributed to microprogrammable processors does not
constitute a performance bottleneck in the DBC, since the SM component does
not need to accept transformed keywords at rates faster than one every
millisecond. Finally, the logic of the KXU is relatively straightforward so that
the usual problems accompanying dynamic microprogramming (like protection of
programs from one another) are absent.

In Figure 6, we have shown the organization of the various components
that constitute the KXU.

2.5 Implications of the KXU Design

In the previous sections, we have presented a design for transforming
the keywords of a file into an encoded form which can then be used by the SM
to store and retrieve directory entries. We now show that the KXU design
indeed meets the goals established in the opening section.

First, the transformation uniquely identifies the bucket in which the
directory entry of the keyword is to be stored or found. The bucket name in
our design is derived from the 24 high order bits of the encoding. These 24
bits are obtained by concatenating 16 bits of the unique attribute number and
8 bits of the partition number. Thus, keywords whose attributes are the same
and whose values are in a given range will be found in the same bucket. Such
an arrangement ensures that identical keywords of different files or different
attributes of the same file are not stored in the same bucket. This minimizes
both the search area in the SM and collisions within a bucket.

Second, the partition number is particularly useful when inequality predicates
have to be processed. For example, if the index terms of all keywords K which
satisfy the predicate K ≥ 3000 are to be retrieved, and if the partition number
for 3000 is, say, n, then the SM need only search those logical buckets whose
attribute number identifies the attribute of K and whose partition number is
greater than or equal to n. Third, the transformation obtained in the design is
of fixed length (48 bits) for all keywords. Fourth, the hashing algorithms (to
be designed by the users) need only produce good results in a narrow range of
values within a partition of the total range of values of an attribute. By this
we mean that even if two keywords in two different partitions have the same
hash value, the transformation will still be different for the two keywords.
Thus, the proposed design alleviates the pressure on the user/database
administrator to come up with sophisticated hash algorithms.
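
The layout of the 48-bit transformation described above can be illustrated with a minimal sketch. The field widths (16, 8 and 24 bits) come from the text; the function names, the ordering of the attribute number ahead of the partition number inside the bucket name, and the assumption that partition numbers increase with keyword value are illustrative assumptions rather than part of the design.

```python
ATTR_BITS = 16   # attribute number width (from the text)
PART_BITS = 8    # partition number width (from the text)
HASH_BITS = 24   # low order bits produced by the user-supplied hash


def transform(attr_no, part_no, hash_val):
    """Build a 48-bit encoding; the field order is an assumption."""
    for value, width in ((attr_no, ATTR_BITS), (part_no, PART_BITS),
                         (hash_val, HASH_BITS)):
        if not 0 <= value < (1 << width):
            raise ValueError("field does not fit in %d bits" % width)
    bucket = (attr_no << PART_BITS) | part_no      # 24 high order bits
    return (bucket << HASH_BITS) | hash_val


def bucket_name(encoded):
    """Recover the 24-bit logical bucket name from an encoding."""
    return encoded >> HASH_BITS


def buckets_for_at_least(attr_no, n):
    """Logical buckets to search for a predicate K >= v, where n is the
    partition number of v (assumes partitions are ordered by value)."""
    return [(attr_no << PART_BITS) | p for p in range(n, 1 << PART_BITS)]
```

Two keywords with the same hash value but different partition numbers differ in their high order bits, which is the property the fourth point relies on.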

Finally, we make a few comments on reliability. The nature of the
information held in the KXU is such that even a total loss of such information
is not catastrophic. The AIT and the HAL are user supplied and thus can be
reconstructed by requesting a reload from the PES. Standard parity check
circuits may be employed with the memory systems to detect random errors and
to request reloads. The processor logic is LSI-based and should require no
special reliability feature.

3. THE STRUCTURE MEMORY (SM)

The structure memory (SM) is the repository of the directory entries
of all the keywords known to the DBC. Retrieval operations on the mass memory
(MM) are preceded by retrieval operations on the SM. Updates to the MM are
accompanied by updates to the SM. Therefore, the performance requirements
of the SM are dictated by a desire to ensure that this component does not
become a bottleneck in the access path to the database residing in the MM.

In meeting such performance requirements, we have proposed alternate designs
which use different emerging technologies. We have adopted this
approach because it is not clear at the present time which, if any, of the
emerging technologies will ultimately become commercially viable. Another
aspect of the designs presented in this section is that we have parameterized
the designs to a great extent. The need to parameterize was prompted by our
desire to offer the DBC as a viable alternative to users with a wide range of
requirements. The DBC, and in particular the SM, should be capable of being
tuned to a particular set of requirements by merely choosing the right set of
design parameters.

3.1  Performance  Requirements  and  Logical  Organization  of  the  SM 

We begin our discussion of SM design by describing its performance
characteristics. The SM must have sufficient capacity to store the directory
entries for a database of $10^{9}$ to $10^{10}$ bytes. Typically, directories
are of the order of 1% to 10% of the database [7]. Therefore, the SM designs
should be viable for capacities of $10^{7}$ to $10^{9}$ bytes. Furthermore, the
SM must have sufficient speed to process queries at a rate commensurate with
the speed of the MM. With a good clustering strategy, a query will usually
require one MAU access by the MM. Since the MM is implemented with modified
moving-head disks and requires 15 to 25 milliseconds to access a MAU, the SM
query processing time must be of the same order. Even though queries will vary
widely in the number of predicates appearing in them, we estimate that in the
worst case queries will seldom have more than 15 to 25 keyword predicates.
Therefore, the SM must process each predicate in about 1 millisecond.
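
The sizing argument above is simple arithmetic and can be checked with a short sketch. The function names are illustrative; the figures (database size, directory fraction, MAU access time, predicate count) are those quoted in the text.

```python
def directory_bytes(database_bytes, percent):
    """Directory size when it is `percent` % of the database."""
    return database_bytes * percent // 100


def per_predicate_budget_ms(mau_access_ms, predicates):
    """Time available per predicate if the SM is to keep pace with
    one MAU access per query."""
    return mau_access_ms / predicates


smallest = directory_bytes(10**9, 1)     # 1% of the smallest database
largest = directory_bytes(10**10, 10)    # 10% of the largest database
```

With 15 predicates against a 15 ms access, or 25 predicates against a 25 ms access, the budget comes out at 1 ms per predicate, which matches the figure used in the rest of the section.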

In order to meet the above requirement, the SM is organized as a
partitioned content addressable memory (PCAM). A partition of the PCAM is
defined to be the set of all of the directory entries that are accessed as a
result of a single search order. In our discussion below, we have used the
term logical bucket to denote a partition of the PCAM, and the term bucket
memory to denote the PCAM itself. These terms are historically associated with
hashing; here they serve to remind the reader that partitions of the PCAM
are identified by transformation of search keys.

Since transformed keywords are identified by their logical bucket
names (see Fig. 4), the search for the directory entries of a keyword amounts
to a search of the named logical bucket. By employing a large number of
small logical buckets, we can reduce the amount of information to be searched
in each one. The size of buckets, however, cannot be reduced indefinitely
without incurring a performance degradation. The minimum bucket size is
governed by the technology used to implement the SM.

In practice we cannot guarantee that the logical buckets will be
equally filled. Therefore, buckets cannot be of fixed size. In conventional
disk-oriented systems this problem is solved by allowing preallocated
fixed-size buckets to spill over into shared overflow areas. There are two
obvious drawbacks to this approach. First, the need for a shared overflow area
increases the amount of data that must be searched. Second, by preallocating
large fixed-size buckets, space wastage is inevitable. To avoid these problems
we need to have truly variable-size buckets. The implementation of
variable-size buckets can best be achieved by making the physical block size
substantially smaller than the average bucket size.

The number of logical buckets is determined by the number of bits that
are used to represent the bucket names. This, in turn, is dependent on the
attribute name length and the partition name length. However, the actual
number of buckets that can exist at any given instant is limited by the number
of physical blocks that can be independently accessed by the SM. We thus
make a distinction between logical buckets (created by the KXU) and physical
buckets (that can actually exist) in the SM. This observation implies that
the SM must employ a mapping device to maintain the relationship between
logical bucket names and physical buckets.

The SM consists of three logical components: the bucket mapping unit,
the bucket memory and the look-aside buffer (see Fig. 7). The bucket mapping
unit is used to translate logical bucket names into physical bucket names.
Conceptually, the bucket mapping unit contains a set of couples of the form
(l, k) where l is the name of a logical bucket and k is the name of a physical
bucket. The couple indicates that the logical bucket l is stored in physical
bucket k. The bucket memory maintains a variable number of variable-size
physical buckets. It accepts as input a physical bucket name and an access
order, and operates by performing the indicated access on the physical bucket
identified by the input physical bucket name. The look-aside buffer is used
to buffer update operations in order to improve performance.
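
The division of labor between the bucket mapping unit and the bucket memory can be pictured with a small sketch. The class and method names, the use of integers as physical bucket names, and the two access orders "insert" and "search" are hypothetical; the text specifies only the (l, k) couples and the fact that the bucket memory is driven by a physical bucket name plus an access order.

```python
class BucketMappingUnit:
    """Holds couples (l, k): logical bucket l is stored in physical bucket k."""

    def __init__(self):
        self._couples = {}
        self._next_physical = 0   # hypothetical allocation policy

    def lookup(self, logical):
        return self._couples.get(logical)

    def allocate(self, logical):
        if logical not in self._couples:
            self._couples[logical] = self._next_physical
            self._next_physical += 1
        return self._couples[logical]


class BucketMemory:
    """A variable number of variable-size physical buckets."""

    def __init__(self):
        self._buckets = {}

    def access(self, physical, order, entry=None):
        bucket = self._buckets.setdefault(physical, [])
        if order == "insert":
            bucket.append(entry)
            return None
        if order == "search":
            return list(bucket)
        raise ValueError("unknown access order: %r" % order)


bmu, memory = BucketMappingUnit(), BucketMemory()
k = bmu.allocate(0x000701)                 # logical name -> physical name
memory.access(k, "insert", "entry-1")
found = memory.access(bmu.lookup(0x000701), "search")
```

Only logical buckets that actually hold entries consume a physical bucket, which is why far more logical names can exist than physical blocks.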

3.1.1  Storage  Considerations  of  Directory  Entries 
As we recall from [3], a directory entry D(F,K) consists of a set of
index terms

$\{(f_1, c_1, s_1), (f_2, c_2, s_2), \ldots, (f_k, c_k, s_k)\}$

where f is an MAU address, c is a cluster number, and s is a security atom
name. Since a bucket cannot be shared by two or more files and since file
names may be identified by logical bucket names, storage formats for index
terms need not carry file names. Such a format, called the encoded directory
entry format, is shown in Figure 8. To estimate the length of the storage
format, let us compute the length of an index term. We first determine the
number of bits required to specify a MAU address. The DBC must have a capacity
of $10^{9}$-$10^{10}$ bytes. A MAU (in the case of a moving-head disk
implementation) typically has a capacity of about $4 \times 10^{5}$ bytes (say,
20 tracks per cylinder, with 20 Kbytes per track). This means we need about
$2.5 \times 10^{3}$ to $2.5 \times 10^{4}$ MAUs to hold a database of capacity
$10^{9}$-$10^{10}$ bytes. To address the higher limit of the range of MAUs, we
need 15 bits.

To estimate the number of clusters that have to be represented and
hence the number of bits required to uniquely identify a cluster in the index
term, let us estimate, in the worst case, that a file will have 1000 clusters.
If the DBC is designed to hold in the neighborhood of 1,000 files, then the
total number of clusters required to be represented is about $10^{6}$. Thus we
need 20 bits to represent a cluster name uniquely. The same kind of argument
may be advanced to determine the number of bits required to uniquely represent
a security atom name. We are now in a position to determine the length (in
bits) of an index term. Each index term will require 55 bits (= 15 + 20 + 20).
Can we do better than this? The answer, fortunately, turns out to be in the
affirmative. Let us see how we can achieve a reduction in the index term size.
First, recall from the previous section on the design of the KXU that directory
entries of keywords of different files will be stored in and retrieved from
different logical buckets. Second, each query formulated by the user applies
only to records within a particular file (i.e., a query cannot refer to more
than one file). Third, no file is expected to occupy all of the MAUs. Fourth,
no file is expected to have more than 1000 clusters or security atoms. These
four observations imply a) that the index terms stored in the SM need not be
unique, and b) that the lengths (in bits) of each of the three index term
components can be less than those required to represent the entire ranges of
these components. Thus if a file is allocated a maximum of n MAUs and has a
maximum of p clusters and q security atoms, then an index term need only occupy

$\lceil \log_2 n \rceil + \lceil \log_2 p \rceil + \lceil \log_2 q \rceil$ bits.

The format of such an index term is shown in Figure 9. In the next section, we
shall propose an index term format which reflects much of the above discussion.
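
The bit counts in this argument can be reproduced with a short calculation. The sketch below is illustrative: the helper names are hypothetical, and the per-file limit of n = 256 MAUs is an assumed example value (p = q = 1000 is the figure used in the text).

```python
def bits_for(count):
    """Bits needed to distinguish `count` values, i.e. ceil(log2(count))."""
    return max(1, (count - 1).bit_length())


def index_term_bits(n_maus, n_clusters, n_atoms):
    return bits_for(n_maus) + bits_for(n_clusters) + bits_for(n_atoms)


MAU_BYTES = 4 * 10**5                      # 20 tracks x 20 Kbytes per track
total_maus = 10**10 // MAU_BYTES           # MAUs for the largest database

# System-wide unique index term: every MAU, 1000 files x 1000 clusters
# and, by the same argument, 1000 x 1000 security atoms.
global_bits = index_term_bits(total_maus, 1000 * 1000, 1000 * 1000)

# Per-file index term; n = 256 is an assumed illustrative limit.
per_file_bits = index_term_bits(256, 1000, 1000)
```

Under these assumptions the per-file term is roughly half the length of the system-wide one, at the cost of the translation step discussed next.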

What is the price that we have to pay in order to achieve this reduction?
Since index terms are no longer unique, we need to have some kind of file
dictionary which, given an index term, will produce the MAU address, cluster
number, and security atom number which can then be used to access the MM. We
also need allocation and release mechanisms for MAU identifiers, cluster
identifiers and security-atom identifiers. These are carried out by a
functionally specialized component called the index translation unit (IXU),
which is the subject of our discussion in a later section.

3.1.2  The  Role  of  the  Look-Aside  Buffer 

During the normal operations of the database, the retrieval of
information from the SM will be far more frequent than the insertion (or
update) of information in the SM. However, it is conceivable that during
short intervals of time, a large number of updates may have to be carried out.
Such an occurrence can seriously affect the average information retrieval rate.
The use of a look-aside buffer is aimed at alleviating such a degradation in
SM performance. When an update or insert command is received by the SM, it
is placed in the look-aside buffer. Execution of commands in the buffer is
delayed until one of the following two conditions is met: (a) the loading of
the buffer reaches a certain level, called the threshold value; (b) the SM has
no retrieve command awaiting execution. If and when any one of these conditions
is met, the processing of the commands in the buffer is taken up on a FIFO
basis. In order to maintain a FIFO discipline, commands in the buffer are
chained according to their arrival times.

The SM monitors the load (i.e., the number of entries) in the buffer
in the following manner. When a new insert/delete command is received, it
is placed at the bottom of the chain. The SM then checks the total number of
entries in the buffer. If it is equal to or greater than the threshold value,
the SMC proceeds to execute all the commands from the beginning of the chain.
Condition (b) is monitored as follows. After execution of a command, the SM
sends a ready signal to the DBCCP. The SM then waits for a prespecified
amount of time (called the time-out period) for another command from the DBCCP.
If the time-out period expires without a command being issued by the DBCCP,
the SM proceeds to execute the first command in the buffer.
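
The two triggering conditions can be summarized in a small sketch. It follows the text in draining the whole chain when the threshold is reached and in executing only the first command when the time-out expires; the class name, the callback used to stand in for command execution, and the explicit `on_timeout` call (in place of a real timer) are assumptions made for illustration.

```python
from collections import deque


class LookAsideBuffer:
    def __init__(self, threshold, execute):
        self.threshold = threshold    # condition (a): buffer load limit
        self.execute = execute        # stand-in for executing a command
        self.chain = deque()          # commands chained in arrival order

    def submit(self, command):
        """Place a new update command at the bottom of the chain."""
        self.chain.append(command)
        if len(self.chain) >= self.threshold:
            self.drain()

    def drain(self):
        """Execute all commands from the beginning of the chain (FIFO)."""
        while self.chain:
            self.execute(self.chain.popleft())

    def on_timeout(self):
        """Condition (b): no command arrived within the time-out period."""
        if self.chain:
            self.execute(self.chain.popleft())


executed = []
buf = LookAsideBuffer(threshold=3, execute=executed.append)
buf.submit("insert A")
buf.submit("insert B")        # below threshold: nothing executed yet
buf.submit("delete C")        # threshold reached: chain is drained
buf.submit("insert D")
buf.on_timeout()              # idle period: first buffered command runs
```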


3.2  The  Physical  Realization  of  the  SM 

The three logical components of the SM (the bucket mapping unit, the
bucket memory and the look-aside buffer) are realized by the structure memory
controller (SMC), the bucket memory system (BMS) and the look-aside buffer
memory (LABM). The SMC is responsible for interfacing with the DBCCP, the
KXU, and the SMIP (see Fig. 1), translating DBCCP commands, and transforming
logical bucket names into physical bucket names.

We begin this section by proposing an efficient directory entry storage
format based on the discussions in the previous section. We then discuss
three design alternatives for the BMS. Each of the three design alternatives
is based on a different technology. The first design utilizes magnetic bubble
memory systems; the second is based on charge-coupled devices; and the third
uses electron beam addressable memories. All three technologies are emerging
technologies and therein lies the rationale for presenting three design
alternatives. It is not clear, at present, which of the three will ultimately
develop into a commercial product. One thing, however, is clear: it will be
difficult if not impossible to implement a reasonably powerful SM at a
reasonable cost without the emerging technologies. The design alternatives are
followed by a discussion on the BMS logic. The BMS utilizes an array of
processing elements to search and manipulate the contents of a bucket. Finally,
we deal with the implementation of the SMC and the LABM.


3.2.1  Data  Structures  for  the  Bucket  Memory  System  (BMS) 

In earlier sections, we noted that directory entries of keywords which
are stored in a particular logical bucket (and, hence, in a single physical
bucket) have identical logical bucket names. Since the logical bucket name
occupies the 24 high order bits in a transformed keyword, it follows that
all directory entries in a bucket would have the same 24 high order bits in
their respective keyword field. Therefore, only the 24 low order bits of
T(K) would actually be stored in the storage system. This is shown in Figure
10. [Recall from the discussion on the KXU that the 24 low order bits are
produced by user-supplied hash algorithms, and hence do not play a role in
locating the bucket that holds the directory entry of the keyword.]

The index term's three components have the following lengths. Eight
bits represent an MAU identifier, while two ten-bit fields are used to
identify the cluster and the security atom. This makes for a total of 28
bits. The rationale for this choice is as follows. Consider the space
requirements of a file. The minimum unit of allocation of mass memory is
an MAU. Allocation of an MAU means an allocation of about 4 million bits
of MM. A maximum allocation of 256 MAUs for a file provides a total capacity
of over a billion bits, which we feel is adequate for most large files
encountered in practice. Thus, eight bits in the index term can uniquely
identify any of the 256 possible MAUs allocated to a file. As we shall see
later, the IXU maintains a translation table for each file by which the actual
MAU address may be determined. Also, we do not anticipate files with more than
1,000 clusters and 1,000 security atoms. Thus the 10-bit fields are seen to
be adequate to represent the anticipated cluster and security atom population
in a file.
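
A 28-bit index term with these field widths could be packed and unpacked as in the sketch below. The widths (8, 10, 10) are from the text; the order of the fields within the term and the function names are assumptions, since the actual layout is given only in Figure 11a.

```python
MAU_BITS, CLUSTER_BITS, ATOM_BITS = 8, 10, 10


def pack_index_term(mau_id, cluster_id, atom_id):
    """Pack the three per-file identifiers into one 28-bit integer."""
    for value, width in ((mau_id, MAU_BITS), (cluster_id, CLUSTER_BITS),
                         (atom_id, ATOM_BITS)):
        if not 0 <= value < (1 << width):
            raise ValueError("identifier does not fit in %d bits" % width)
    return ((mau_id << (CLUSTER_BITS + ATOM_BITS))
            | (cluster_id << ATOM_BITS)
            | atom_id)


def unpack_index_term(term):
    """Inverse of pack_index_term: returns (mau_id, cluster_id, atom_id)."""
    atom_id = term & ((1 << ATOM_BITS) - 1)
    cluster_id = (term >> ATOM_BITS) & ((1 << CLUSTER_BITS) - 1)
    mau_id = term >> (CLUSTER_BITS + ATOM_BITS)
    return mau_id, cluster_id, atom_id
```

The identifiers are local to a file; mapping them back to actual MAU addresses is the job of the IXU translation table mentioned above.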

In the case of clustering/security keywords, the logic of some of the
DBCCP algorithms requires that the number (or count) of clustering/security
keywords that make up a cluster/security atom be made available in the index
term. Thus, each index term of a clustering/security keyword has a count
field associated with it. In Fig. 11, we have the formats of the various
types of directory entries.

We now turn to the consideration of sizes of typical directory entries
with a view to establishing certain minimum-access-unit sizes within the
bucket memory. Such minimum-access units will be called modules. The size
of the directory entry of a simple keyword shown in Figure 11b varies with
the number of index terms in the entry. Let this number be m for simple
keyword K. Then the number of bits b(K) required for storing the directory
entry of a keyword K is

$b(K) = |t(K)| + 16 + m \cdot |I|$

where $|t(K)| = 24$ (low order bits of T(K))
and $|I| = 28$ (size of an index term);

thus, $b(K) = 40 + 28m$.

The size of a security or clustering keyword directory entry will be
larger than what we have indicated for a simple keyword directory entry.
But since the number of security/clustering keywords is small compared to
simple ones, we can use the above figure for our purpose. In order to
minimize the access times, it is desirable that a module have sufficient
capacity to hold the entire directory entry. Further, it would be desirable
to pack several directory entries in a single module, since the number of
modules to be accessed for directory entries of keywords satisfying a
predicate would then be minimized. On the other hand, we would like to retain
the flexibility of varying the size of the bucket (which is made up of one
or more modules) as finely as possible, since we have advocated small physical
blocks (known as modules here). Finally, we should bear in mind that module
sizes can only vary between (narrow) limits set by the technology used to
implement the BMS.

[Figure 11a. Format of an Index Term]

[Figure 11b. Format of a Simple Keyword Directory Entry]

We are now in a position to state the requirements of the BMS. These
are as follows:

* Total capacity is $10^{7}$-$10^{9}$ bytes;

* Typical module capacity is in the range 1-8 Kbits;

* Access time to any directory entry must be under
1 msec; and

* It should be highly reliable.
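
To get a feel for how many simple-keyword directory entries fit in a module, the entry size formula b(K) = 40 + 28m can be combined with the module capacities listed above. The sketch is illustrative only: the function names are hypothetical and 1 Kbit is taken to be 1024 bits, which the text does not specify.

```python
def entry_bits(m):
    """Size of a simple keyword directory entry with m index terms:
    24 low order bits of T(K), a 16-bit field, and m 28-bit index terms."""
    return 24 + 16 + 28 * m


def entries_per_module(module_bits, m):
    """Whole directory entries (each with m index terms) per module."""
    return module_bits // entry_bits(m)


small_module = 1 * 1024    # assumed: 1 Kbit = 1024 bits
large_module = 8 * 1024
```

For example, entries carrying a single index term occupy 68 bits each, so even the smallest module in the stated range holds several of them.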

3.2.2  Technology  Based  Design  Alternatives  for  the  BMS 

A.  Magnetic-Bubble-Memory-Based  Bucket  Memory 

The characteristics of bubble memory systems are summarized below [4]:

* Bubble memories are sequential access memories.

* Shift rates at present are in the region of 100 KHz;
potential exists for higher (up to 500 KHz) shift
rates.

* Bidirectional shifting is possible; bubbles may be
stopped and started without incurring time penalties.

* At present, logic may not be integrated with memory.

* Cost is less than 0.02 cents/bit.

a. Bubble Memory Organization - With the above characteristics in mind,
we have chosen the major-minor loop organization [4] as the basic chip
configuration. This is shown in Figure 12. In this organization, bubbles
are organized in closed loops called minor loops. The number of bits W
per minor loop and the number of such loops N in a chip are design parameters
which we shall compute later in this section. Bubbles in the N minor loops
are always moved in synchronization. The presence of a bubble in a bit
position constitutes a '1' and the absence of the bubble constitutes a '0'.
Bubble chips with the above configuration are organized into bit-planes. Each
bit-plane consists of a set of M chips mounted on a board. A set of bit-planes,
called a bubble file, is shown in Figure 13.

The concept of a bit-plane is borrowed from semiconductor memory
technology and is primarily intended for enhancing the reliability of the
system. We shall briefly explain how this is brought about. Processing elements
which access these memories do so in words. A word may comprise, say, P
bits. These P bits may either be contained entirely within a chip or be
distributed one bit per chip across P separate bit-planes (as shown in Figure
13). The major disadvantage with the former type of storage is that if the
chip containing the word malfunctions, entire words may become inaccessible.
On the other hand, the distribution of bits of a word over several chips
enables us to recover from single-chip failure by means of redundancy coding
(e.g., Hamming code). The set of chips across bit-planes, each of which
carries one bit of several words, is called a chip pile, as shown in Figure 14.

In summary, bubble memory systems are organized as bubble files. A
bubble file consists of P bit-planes (where P is the word length of the
processor). Each plane has M chips; thus there are M x P chips in a bubble
file. Each chip has N loops with W bits per loop. Thus a chip has a total
of N x W bits. Therefore the total capacity of a bubble file is
M x N x W x P bits.
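
The capacity relations just summarized are easy to tabulate. In the sketch below the function names and the sample parameter values (M = 20, N = 48, W = 208, P = 16) are chosen purely for illustration and are not design values from the report.

```python
def chip_bits(N, W):
    """Capacity of one chip: N minor loops of W bits each."""
    return N * W


def chips_per_file(M, P):
    """Chips in a bubble file: P bit-planes of M chips each."""
    return M * P


def bubble_file_bits(M, N, W, P):
    """Total capacity of a bubble file in bits."""
    return chips_per_file(M, P) * chip_bits(N, W)


example_bits = bubble_file_bits(M=20, N=48, W=208, P=16)
example_bytes = example_bits // 8
```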

b. Access Considerations - As seen from Figure 12, there are two time
periods involved in reading out a single bit from any of the minor loops.
First, the bit (or bubble) has to be moved to the major loop (from its position
in the minor loop). We call this time period the access time. Second,
the bit has to be moved along the major loop till it reaches the read station.
This time period is called the latency time. In the worst case,

$t = \frac{W}{f} + \frac{N}{f}$

where t is the time to read a single bit, W/f is the access time, N/f is the
latency time, and f is the shift frequency. At the present state of technology,
the shift frequency is rather low (about 100 KHz) and thus the total time to
read a bit could be several milliseconds for any reasonable value of W and N.


[Figure 13. A Bubble File. Label: a chip pile (see Fig. 14 for a blow-up of a
chip pile).]

[Figure 14. A Bubble Chip Pile Containing P Chips, Each Containing One Bit of
N x W Words. Labels: N loops; W bits; P bits constitute a word.]
This rules out the possibility of serial readout of minor loops. Clearly,
a high degree of bit-interleaving is indicated. Since the chips associated with
a chip pile can be activated simultaneously, we obtain a P-fold increase in
data transfer. Logically, we find that such a scheme will provide us with
one word instead of one bit per readout. To further increase the transfer
rate, we can activate all the M chips on a bit-plane. This scheme gives us
M words (of P bits each) per readout.

We are now in a position to define a module, which is the smallest unit
of access, and to determine the time required to access it.

Definition - A module is the set of words which can be retrieved by a
complete readout of the major loops of all the chips in a bubble file.

Since the contents of a major loop within a chip constitute one bit
position of all the minor loops in the chip, we conclude that the number of
modules in a bubble file is equal to the number of bit positions in the
minor loop. The average access time to a module is (W/2)/2f. (Here we
take advantage of bidirectional shifting; thus, the longest time for moving
a bit from the minor loop into the major loop is W/2f. On average this
time will be one half the longest time.) The time to read a module is N/f.
(N is the number of minor loops in the chip. By definition a module has
one bit in each and every minor loop in the chip.) Thus, the average time
$T_m$ to access and read a module is

$T_m = \frac{1}{f}\left(\frac{W}{4} + N\right)$.

In our design we shall assume f to be 100 KHz. Then,

$T_m = \frac{1}{10^{5}}\left(\frac{W}{4} + N\right)$ seconds.

Since a keyword directory entry is stored in a module, and since we require
that a keyword entry be retrieved within 1 ms, we have the inequality

$\frac{1}{10^{5}}\left(N + \frac{W}{4}\right) \le 10^{-3}$

or

$\boxed{N + \frac{W}{4} \le 100}$

The above inequality holds only if we assume that a keyword directory entry
is found in the first module that is accessed. Since more than one module
may be allocated to a bucket, it is conceivable that several modules may have
to be accessed before the directory entry is located. However, if the number
of processing elements is sufficiently large, then the algorithms presented in
a later section will ensure that the modules allocated to a bucket are uniformly
distributed among the memory units of several processing elements. This
observation implies that it will more often be the case that a processing
element will have to access just one module in its memory unit to locate a
keyword directory entry. Under this condition, the inequality is useful for
calculating values of N and W. In Table III, we have tabulated typical values
of N and W satisfying the inequality in the box.
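
Reading the boxed inequality as N + W/4 <= 100 (with f = 100 KHz and a 1 ms limit), candidate (N, W) pairs can be generated mechanically. The sketch below takes the largest W allowed for each N; stepping N in multiples of 8 is an arbitrary illustrative choice, and the function names are hypothetical. The resulting products N x W coincide with the chip sizes that appear as column headings of Table V (2944, 5376, ..., 1536).

```python
F_HZ = 100_000   # shift frequency assumed in the text (100 KHz)


def module_time_ms(N, W, f_hz=F_HZ):
    """Average time to access and read a module, in milliseconds."""
    return 1000.0 * (W / 4 + N) / f_hz


def satisfies_constraint(N, W):
    """Integer form of N + W/4 <= 100."""
    return 4 * N + W <= 400


def candidates(step=8):
    """(N, W, chip size) with the largest W permitted for each N."""
    rows = []
    for N in range(step, 100, step):
        W = 4 * (100 - N)
        rows.append((N, W, N * W))
    return rows


for N, W, size in candidates():
    print(N, W, size, module_time_ms(N, W))
```

Each generated pair sits exactly on the 1 ms boundary; any smaller W for the same N also satisfies the constraint, at the cost of a smaller chip.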

The number of processors needed for the bucket memory depends on the
total capacity of the bucket memory, the module access time and processor
capability. Since the capability of a processor is not an easily determinable
quantity, we merely tabulate in Table IV the memory unit capacity as a function
of total bucket memory capacity and number of processors. In Table V, we
tabulate the number of chips required in the memory unit of a processor as a
function of the chip size (from Table III) and memory unit capacity (from
Table IV). A knowledge of the number of chips per memory unit will enable
us to determine the number of bubble files needed and the number of chips
in each of the bubble files. Let P be the size of the processor word and
M be the number of chips on a bit-plane. Then, the number of bubble files
is (total number of chips in memory unit) / (M x P). P is fixed when the
processor is chosen. The number M is decided by trial and error within
limits set by packaging technology. For example, a typical range of M would
be 10-30 chips on a single board. In this design, the average retrieval
time of a module is independent of the number P of bit-planes or the number
M of chips on each bit-plane. The size of a module, however, is dependent
on these parameters and is M x N x P bits. The number of modules in a bubble
file is W, the number of bit positions (or bubbles) in a minor loop.
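
The relations in this paragraph can be sketched as follows. The chip count formula is an inference rather than something the text states: the 40-Kbyte row of Table V is reproduced if 1 Kbyte is taken as 1000 bytes and the quotient is rounded up. Function names and the sample values of M and P are hypothetical.

```python
def ceil_div(a, b):
    """Integer ceiling of a / b for positive integers."""
    return -(-a // b)


def chips_needed(unit_kbytes, chip_bits):
    """Chips required for a memory unit of the given capacity.
    Assumption: 1 Kbyte = 1000 bytes = 8000 bits."""
    return ceil_div(unit_kbytes * 8000, chip_bits)


def bubble_files(total_chips, M, P):
    """Number of bubble files: total chips / (M x P), rounded up."""
    return ceil_div(total_chips, M * P)


def module_bits(M, N, P):
    """Size of a module: one bit from each of N loops on M x P chips."""
    return M * N * P


chips = chips_needed(40, 2944)          # 40-Kbyte unit, 2944-bit chips
files = bubble_files(chips, M=10, P=8)  # M and P are illustrative values
```

Changing M or P alters the module size and the number of bubble files but, as noted above, not the average module retrieval time, which depends only on N, W and f.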


c. Design Algorithm - This algorithm gives a step-by-step procedure to
determine the various parameters discussed thus far.

Input Arguments: 1. Size of SM required

2. Number of processing elements

3. Word size P of processor

4. Size of module desired within limits.

Step 1: From the size of the SM and the number of processors, determine the
size of the memory unit attached to each of the processors
by using Table IV. Call this S_m.

Step 2: From the size of the module desired and the word size P
of the processor, determine the product M x N, where M is
the number of chips on a bit-plane and N is the number of
minor loops in the chip, since the module size is M x N x P.
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Table  ni.  Typical  values  of  N aad  W which  satisfy  access  coustralnts. 
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Table  V.  Number  of  chips  as  a function  of  chip  size  and  memory  unit  capacity  (K  bytes). 


Memory  Unit 


Capacity 
in  Kbytes 

2944 

5376 

7296 

40 

109 

44 

80 

218 

120 

88 

160 

436 

240 

176 

320 

872 

480 

352 

400 

440 

800 

1200 

880 

1600 

2400 

1760 

3200 

8720 

4000 

8000 

16. 000 

32.  000 

35200 

Chip  size  (bits) 


8704 

9600 

— 

9984 

9856 

9216 

— 

8064 

6400 

4224 

1536 

37 

34 

33 

33 

35 

40 

50 

76 

209 

79 

68 

66 

66 

70 

80 

100 

152 

418 

148 

136 

132 

132 

140 

160 

200 

304 

836 

296 

272 

264 

264 

280 

320 

400 

608 

1672 

370 

340 

330 

330 

350 

400 

500 

760 

740 

680 

660 

660 

700 

800 

1000 

1520 

4180 

1480 

1360 

1320 

1320 

1400 

1600 

2000 

3040 

8360 

2960 

2720 

2640 

2640 

2700 

3200 

4000 

6080 

16720 

3700 

3400 

3300 

3300 

3500 

5000 

7600 

7400 

6800 

6600 

6600 

7000 

14800 

136000 

13200 

13200 

14000 

20000 

83600 

29600 

27200 

26400 

26400 

29600  27200  26400  26400  2 
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Step  3:  Using  S],^  and  choosing  available  chip  size,  determine  the 

number  of  chips  required  from  Table  V.  Call  it  Cjii. 

Step  4:  For  the  chip  size  used  In  Step  3,  look  up  Table  III  to 

determine  the  values  of  N and  W.  If  the  chip  size  Is  not 
found  in  Table  III,  then  go  back  to  Step  3 and  choose 
another  chip  size. 

Step  5:  Determine  M where  M " Module  Size  / (P  x N). 

Step  6:  Determine  number  of  chips  on  a bubble  file  BF^  when 
BF*  • M X P. 

Step  7:  Compute  the  number  of  bubble  files  as(*C^  / BFg]  . 
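Steps 5 through 7 are plain arithmetic once the table lookups of Steps 1 through 4 have been made. The following is a minimal sketch of that arithmetic, not a part of the original report. The function name and its parameters are hypothetical, the values of N and C_m are assumed to have been read from Tables III and V by hand, and rounding M up to a whole chip is an assumption.

```python
import math

def bubble_design(module_bits, word_bits, minor_loops, chips_per_unit):
    """Steps 5-7 of the bubble-memory design algorithm (illustrative sketch).

    module_bits    -- desired module size in bits (input argument 4)
    word_bits      -- processor word size P (input argument 3)
    minor_loops    -- N, read from Table III for the chosen chip (Step 4)
    chips_per_unit -- C_m, read from Table V (Step 3)
    """
    # Step 5: chips per bit plane, M = module size / (P x N).
    # Assumption: M is rounded up so the module is not smaller than requested.
    chips_per_plane = math.ceil(module_bits / (word_bits * minor_loops))
    # Step 6: chips in one bubble file, BF_s = M x P.
    chips_per_file = chips_per_plane * word_bits
    # Step 7: bubble files needed to hold all chips of the memory unit.
    bubble_files = math.ceil(chips_per_unit / chips_per_file)
    return chips_per_plane, chips_per_file, bubble_files

# Inputs of a 2-Kbyte module, 16-bit word, N = 36 and C_m = 660.
print(bubble_design(module_bits=2 * 8 * 1024, word_bits=16,
                    minor_loops=36, chips_per_unit=660))
```

With these inputs the sketch yields M = 29, BF_s = 464 and 2 bubble files.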


d. Design Example - We now give an example using the above design
algorithm.

Input Arguments: 1. Structure Memory Size = 10^8 bytes
                 2. Number of Processors = 128
                 3. Word Size = 16 bits
                 4. Module Size = 2 Kbytes.

Step 1: From Table IV we find S_m = size of memory unit = 800 Kbytes.

Step 2: Module size = 2 Kbytes = 2 x 8 x 1024 bits = M x N x P.
        Therefore, M x N = (2 x 8 x 1024) / 16 = 1024.

Step 3: Choose chip size = 9856.
        Then, from Table V, we have C_m = 660.

Step 4: For the chip size = 9856, we have from Table III
        N = 36, W = 224.

Step 5: M = Module Size / (P x N) = (2 x 8 x 1024) / (16 x 36) = 1024 / 36,
        which is rounded up to 29.

Step 6: BF_s = number of chips per bubble file = 29 x 16 = 464.

Step 7: Number of bubble files = ⌈C_m / BF_s⌉ = ⌈660 / 464⌉ = 2.


Summary of design example

Number of bubble files = 2.
Number of chips on each bubble file = 464.
Capacity of bubble file = 9856 x 464 bits = 571648 bytes, or about 570 Kbytes.
Capacity per memory unit = 1140 Kbytes.

Chip details

N = number of minor loops = 36.
W = number of bits per loop = 224.
Module Size = 29 x 36 x 16 = 16704 bits = 2088 bytes.
Number of modules per bubble file = 224.
Number of modules per memory unit = 448.
Number of modules in the SM = 448 x 128 = 57344.

e. Reliability Consideration - The organization outlined in the above
subsection is well suited for incorporating error detecting and correcting
codes. The incorporation of one such code will be briefly described here.

In Table VI, we tabulate the number of error code bits required for various
word sizes. For any word size p, we add k (error code) bit planes to the
p (data storing) bit planes. Input and output error code generators are then
added to identify the bit plane in which a fault might have occurred. On
identification, the operator is alerted to replace the faulty bit plane with
a fault-free plane and reload the new bit plane with the help of the error code
generators. We observe that the detection, correction and replacement of
malfunctioning components are achieved without having to place the system
off-line.
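The number of error code bit planes for a given word size can be looked up in Table VI or computed. The sketch below is not from the report: it uses a rule inferred from the rows of Table VI (the smallest k for which p + k does not exceed 2^(k-1) - 1), which reproduces the table for word sizes 4 through 119. The function name is hypothetical.

```python
def check_bits(data_bits):
    """Smallest k with data_bits + k <= 2**(k - 1) - 1.

    Assumption: this rule is inferred from the rows of Table VI and is
    only meant for word sizes in the range the table covers (4 to 119).
    """
    k = 1
    while data_bits + k > 2 ** (k - 1) - 1:
        k += 1
    return k

for p in (8, 16, 32, 64):
    k = check_bits(p)
    # Extra bit planes as a percentage of the data bit planes.
    print(p, k, p + k, f"{100 * k / p:.1f}%")
```

For a 16-bit word the rule gives k = 6, that is, 6 error code bit planes in addition to the 16 data bit planes.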

The need for such reliability measures, at a cost of about 37.5% of a
memory system with a 16-bit word, may be justified in terms of the MTBF (mean
time between failures) of the system, where

        MTBF = 1 / (d x λ),

d is the number of chips and λ is the failure rate of a chip. In the design
example, we had a chip count of about 1000 per memory unit. Since we had 128
memory units corresponding to 128 processing elements, the total chip count
is about 128000. If we assume a chip failure rate of 0.1% per 1000 hours,
then we have an MTBF of about 8 hours. With such an MTBF, the reliability
provision looks quite desirable.
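The MTBF estimate can be reproduced with a few lines of arithmetic. This is a sketch under the assumption, implicit in the estimate above, that devices fail independently at a constant rate and that any single failure counts as a system failure. The function name and parameters are hypothetical.

```python
def mtbf_hours(device_count, failure_fraction, period_hours):
    """MTBF = 1 / (d x failure rate), with the rate per device-hour.

    failure_fraction is the fraction of devices expected to fail within
    period_hours (0.001 per 1000 hours for a rate of 0.1% per 1000 hours).
    """
    failure_rate = failure_fraction / period_hours
    return 1.0 / (device_count * failure_rate)

# 128 memory units of about 1000 chips each, 0.1% failures per 1000 hours.
print(round(mtbf_hours(128 * 1000, 0.001, 1000), 2))
```

The result is a little under 8 hours, in line with the figure quoted in the text.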

B. Charge-Coupled-Device-Based Bucket Memory

Let us enumerate the main characteristics of charge-coupled devices
(CCDs).

* Charge-coupled devices, like bubble memories, are
  sequential access memory devices.

* Shift rates range from 2 MHz to 10 MHz.

* Memory needs to be refreshed periodically.

* On-chip logic is possible.

* High stand-by power is required.

* Cost per bit is between 0.02 and 0.05 cents.

The technology of CCDs is in fact the technology of semiconductor devices,
and many of the innovations and experiences in LSI fabrication are applicable
to charge-coupled devices. In addition, because the structure of CCDs is
simpler than that of other LSI products [6], one should expect higher chip
density. Currently announced chips have a capacity of 16K bits, while a
65 K-bit prototype has been built [11]. It is estimated that the 128 K-bit
chip is a distinct possibility with improved fabrication techniques [6].

Table VI. Single Error Correction and Double Detection [12].

Error Code Bits k           5             6             7             8
Data word size p          4-10         11-25         26-56        57-119
Total Bits n = p + k      9-15         17-31         33-63        65-127
Redundancy Ratio n/p   2.25 - 1.5   1.55 - 1.24   1.27 - 1.13   1.14 - 1.07

(minimum Hamming Code distance: 4)

With this information in the background, we begin our design by tabulating
in Table VII the total number of chips required in the bucket memory as a
function of chip size and total bucket memory size. We note that the bucket
memory is actually divided equally among n processors. In Table IV, we have
tabulated the memory unit size as a function of the number of processors and
total memory size. For every memory unit size tabulated therein, we tabulate
in Table VIII the number of chips needed as a function of the chip size.

A number of chip designs have been suggested [1,11]. In choosing a design
to suit our needs, we must carefully evaluate the tradeoffs involved in each of
the designs. For large bucket memories, say 10^9 bytes, we must use large
chips (> 32K) if we wish to keep the number of chips down to reasonable levels.
This is important from a reliability standpoint, since the higher the chip
count, the lower the MTBF for a given failure rate. Large chips imply
high densities. Designs which feature high densities do so by reducing the
frequency of operation in order to keep the power dissipation of the chip low
and by reducing the number of refresh stages, I/O stations and other circuitry
which take up space in the chip at the expense of storage elements. A typical
case is the SPS (serial-parallel-serial) organization, which can support the
highest densities in a chip. However, such high densities are achieved at the
price of an increase in the access time to a bit within the chip. For example,
in the SPS organization, for a chip size of 64 Kbits and a shift rate of
10 MHz, the average access time is 3.2 msec. Since the SPS organization
does not support multiple loops which can be individually accessed, it is not
possible to shorten the access times by clever organization of modules. We,
therefore, conclude that the price to be paid for realizing large buckets, in
terms of higher access times, is not acceptable.
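The 3.2 msec figure can be checked with a one-line calculation. The sketch below assumes that, on average, half of the chip's contents must be shifted past the output before the wanted bit appears, and it takes 64 Kbits as 64,000 bits; both are assumptions made for illustration, and the function name is hypothetical.

```python
def average_access_ms(chip_bits, shift_rate_hz):
    """Average access time of a purely serial chip, in milliseconds."""
    # Assumption: on average half of the stored bits are shifted out first.
    return 1000.0 * (chip_bits / 2) / shift_rate_hz

print(round(average_access_ms(64_000, 10e6), 2))   # 64 Kbits at 10 MHz
print(round(average_access_ms(16_000, 10e6), 2))   # a smaller chip, same rate
```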

For small and medium size bucket memories we can use chips of sizes 8-16
Kbits, which imply that packing densities can be considerably lower. In Figure
15, we present the so-called LARAM (Line Addressable RAM) organization [1] for
implementing small and medium size bucket memories.

a. LARAM organization - This organization has the following properties:

1. Random access to any CCD line is possible.

2. Access time to a line is negligible compared to the read-out time of the line.

3. On-chip line address decoder is available.

4. High bit rates and short access times are possible.

5. Fabrication is complicated, resulting in low density.

Table VII. Number of chips as a function of chip size and total
bucket memory size (bytes).

Chip Size            SM Capacity (bytes)
(bits)               10^7         10^8
  4K               20,000      200,000
  8K               10,000      100,000
 16K                    -       50,000
 32K                2,500       25,000
 64K                1,250       12,500

Chip sizes are assumed to be in the range 8-16K bits. As in the case of bubble
memory design, we organize chips into CCD files. Each file is made up of P (data)
bit planes and k (error code) bit planes. (Here again, P is the width of a
processor word and k is determined as in Table VI.) Each bit plane is populated
by M chips, where M is one of the design parameters. Unlike bubble memory loops,
however, LARAM lines are not synchronized, and, thus, the concept of modules
corresponding to bit positions (as in the case of the bubble memory system)
does not hold in LARAM design. Instead, the contents of an entire line will
form part of a module. Assuming a 2 MHz basic clock rate, the time required
to read out an entire line of various sizes is given in Table IX. As can be
observed from Table IX, even for a large line (with 512 bits) the read-out
time is small compared to bubble memory read-out times. Let the line size be
W bits. A module is defined to be the set of bits contained in P lines, each
of which is on a different bit plane. The module size is, therefore, P x W. If
there are N lines in a chip, giving a chip capacity of N x W, then each chip in a
chip pile would hold one bit of all the words constituting N modules. A chip
pile is defined in exactly the same manner as in the bubble memory system.

The major difference between the CCD chip pile and the bubble chip pile is
the manner in which the modules are defined. In the bubble memory design,
there were W modules distributed in M chip piles. In the LARAM-based CCD
system there are N complete modules per chip pile. Each module has W words
and each of the W words is distributed across P bit planes. We shall call
the set of modules defined by M CCD chip piles a CCD file, as depicted in
Fig. 16. As in the case of the bubble memory system, we need to determine
the values of M and W. In Table X, we tabulate the number W of bits per
line as a function of the number of lines and chip size.

Table IX. Line size and read-out times (2 MHz clock).

A number of observations can be made at this point as a prelude to the
design algorithm. First, the level of interleaving is limited to that required
to form a single word. Recall that in the case of the bubble design we had
a much higher level of interleaving because of the slower shift rate of the
bubbles. Second, each individual line in a chip can be addressed to the
exclusion of other lines by means of on-chip address decoding circuitry.
Again, this is in contrast to the bubble design, where minor loops could not
be individually addressed. Third, a module is addressed conceptually in three
steps. In step one, we select the CCD file. In step two, the chip within the
CCD file is selected. In step three, the line constituting a part of the
module is selected. In implementation, the module address is broken into
three components: one for selecting the CCD file, one for selecting the chip
within the CCD file, and one for selecting the line within the chip. Fourth,
since the data is transferred quite rapidly (a word every 0.5 µsec), it may
not be possible for the processor to process the information on-the-fly. It
may then become necessary to buffer the data. Fifth, as a result of random
access to a line, the access time to a module is negligible compared to
the access time in the bubble system.
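The three-component module address can be illustrated as follows. This is only a sketch: the report does not say how the components are packed, so the ordering used here (file in the most significant position, then chip, then line) and the function name are assumptions.

```python
def split_module_address(address, chips_per_plane, lines_per_chip):
    """Split a module number into (CCD file, chip position, line).

    Assumption: modules are numbered line by line within a chip position,
    chip position by chip position within a CCD file.
    """
    chip_and_file, line = divmod(address, lines_per_chip)
    ccd_file, chip = divmod(chip_and_file, chips_per_plane)
    return ccd_file, chip, line

# With 25 chips on a bit plane and 32 lines per chip, a CCD file
# holds 25 x 32 = 800 modules.
print(split_module_address(799, chips_per_plane=25, lines_per_chip=32))
print(split_module_address(800, chips_per_plane=25, lines_per_chip=32))
```

The selected chip position and line are the same on all P bit planes, so that the P lines read out together form the words of the module.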


b. Design Algorithm

Input Arguments: 1. Bucket Memory Size
                 2. Module Size
                 3. Word size of processor
                 4. Number of processors

Step 1: From the bucket memory size and the number of processors, determine
        the size of the memory unit attached to each of the processors by
        using Table IV.

Step 2: Choose a chip size, 8K or 16K. Determine the total number of chips
        by using Table VII.

Step 3: Use the word size and module size to determine the number of bits
        per line with the help of the following equation:

        Module Size = Number of bits per word x Number of bits per line

Step 4: Use the number of bits per line determined in Step 3 and the chip
        size determined in Step 2 to determine the number of lines per
        chip from Table X.

Step 5: Determine the read-out time for a module from Table IX.

Step 6: Knowing the chip size (Step 2) and the size of memory attached to a
        processor, determine the number of chips in the memory unit by
        using Table VIII. These chips are distributed over one or more CCD
        files. The number of CCD files can be determined by iterative
        calculation as follows. Let the number of CCD files be 1. Then,

        Number of chips per bit plane =
            (No. of chips in a memory unit) / (Word size x No. of CCD files)

        If the number of chips per bit plane is in the range 10-30, go to
        Step 7. Otherwise, increase the number of CCD files by one and
        recalculate.

Step 7: Number of modules = (Number of CCD files) x
                            (Number of chips on a bit plane) x
                            (Number of lines per chip).
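The computational part of this algorithm (Steps 3, 4, 6 and 7) can be sketched as below. The sketch is illustrative and not part of the report. It assumes that the Table X lookup amounts to dividing the chip size by the line size, that the number of chips in a memory unit has already been obtained from Table VIII, and that the 10-30 range is the packaging limit mentioned for the bubble design. All names are hypothetical.

```python
import math

def ccd_design(module_bits, word_bits, chip_bits, chips_per_unit,
               min_chips=10, max_chips=30):
    """Return (bits per line, lines per chip, CCD files, chips per plane, modules)."""
    # Step 3: module size = bits per word x bits per line.
    bits_per_line = module_bits // word_bits
    # Step 4 (assumed equivalent to Table X): lines per chip.
    lines_per_chip = chip_bits // bits_per_line
    # Step 6: add CCD files until one bit plane holds 10-30 chips.
    files = 1
    while True:
        chips_per_plane = chips_per_unit / (word_bits * files)
        if chips_per_plane < min_chips:
            raise ValueError("no file count gives 10-30 chips per bit plane")
        if chips_per_plane <= max_chips:
            break
        files += 1
    chips_per_plane = math.ceil(chips_per_plane)
    # Step 7: modules in the memory unit.
    modules = files * chips_per_plane * lines_per_chip
    return bits_per_line, lines_per_chip, files, chips_per_plane, modules

# 8-Kbit modules, 16-bit words, 16-Kbit chips, 400 chips in the memory unit
# (800 Kbytes divided by 16 Kbits per chip).
print(ccd_design(8 * 1024, 16, 16 * 1024, 400))
```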


c. Design Example

Input: Total bucket memory size = 10^8 bytes
       Module size desired = 8 Kbits
       Word size of processor = 16 bits
       Number of processors = 128

Step 1: Size of memory unit attached to a processor = 800 Kbytes from
        Table IV.

Step 2: Chip size chosen = 16 Kbits.
        Total number of chips = 50,000 from Table VII.

Step 3: Number of bits per line, W = Module Size / Word Size
                                   = (8 x 1024) / 16 = 512.

Step 4: From Table X, the number of lines, N, is 32.

Step 5: From Table IX, the read-out time of a module is 256 µsec.

Step 6: The memory unit holds 800 Kbytes / 16 Kbits = 400 chips. Then, the
        number of chips per bit plane = 400 / (16 x 1) = 25.

Step 7: Number of modules per processor = 1 x 25 x 32 = 800.
        Number of modules in bucket memory = 800 x 128 = 102,400.

Summary of design example

Number of CCD files = 1
Number of chips on each file = 25 x 16 = 400
Capacity of a CCD file = 25 x 16 x 16 = 6400 Kbits = 800 Kbytes
Capacity per memory unit = 800 Kbytes

Chip details

N = number of addressable lines = 32
W = number of bits per line = 512

Module details

Module Size = 512 x 16 bits = 8 Kbits
Number of modules per CCD file = 800
Number of modules per memory unit = 800
Number of modules in BMS = 800 x 128 = 102,400


d. Reliability Considerations - Much of the discussion on reliability features
for bubble memories is applicable here also. In addition, we have to consider
the volatility of the CCD memories. CCD memories need refreshing periodically.
In the event of a power outage, standby power must be automatically switched on
to ensure retention of bucket memory contents.

C. Electron-Beam-Addressable-Memory-Based Bucket Memory

The characteristics of electron beam addressable memories (EBAMs) are as
follows [9]:

* Memory bits are organized as static locations on an
  electron beam sensitive surface.

* Memory is accessed by positioning an electron beam
  over the required location and scanning the area with
  the beam.

* Access to any location is at worst around 30 µsec, if
  there is a change in the command mode; much less (10 µsec),
  if there is no change in the command mode.

* Data transfer rate is around 10 MHz.

* Each "tube" of memory is conceptually equivalent to
  a bank of memory. Typical sizes of these "tubes" are
  30 million bits.

* Cost is under 0.01 cents/bit.

Of all the technologies considered so far, EBAMs have the best access times
and seem to be truly applicable to large bucket memories. As in the other designs,
we will consider a range of tube sizes and calculate the number of tubes
needed. Table XI tabulates the results of such calculations. Each word of
a module is distributed over a set of tubes, just as in the case of the other
designs discussed thus far. The surface on which the electron beam is allowed
to strike is known as the memory plane. Thus a memory plane is analogous to
a bit plane in our earlier designs. Each memory plane is divided into a number
of lenslet fields (see Fig. 17). A lenslet field is analogous to a chip in our
previous designs. A lenslet field is further divided into pages. A module
address therefore has two components: a field address and a page address.

The size of a module can be varied within wide limits. This is because the
memory plane of an EBAM is unstructured (unlike the bit planes of CCDs and bubble
memories) and the electron beam may be directed to any spot in the memory plane
without incurring significant time penalties.

The electronics required by an EBAM (i.e., the field and page select amplifiers,
the memory plane bias circuits and the power supplies) can be shared between
tubes in a multiple tube system. Thus, the cost per bit in a multiple tube
system could be substantially lower than in a single tube system. Multiple
tube systems can be operated in serial as well as parallel mode. In the serial
mode of operation, only one tube is active at any instant of time, but in the
parallel mode of operation, all the tubes are active concurrently. The parallel
mode of operation gives rise to a high rate of data transfer while reducing the
amount of electronics that can be shared. We have chosen the parallel mode of
operation not because of the resultant high data rate (the data rate of a single
tube is adequate for our purposes) but because our modules are bit-interleaved
to form words.

a. Design Algorithm for EBAM-based Bucket Memory - The design algorithm for
EBAM-based bucket memories is simpler than those of the bubble and CCD designs.

Input Arguments: 1) Bucket Memory Size
                 2) Number of processors
                 3) Module Size
                 4) Word size of processor

Step 1: From the bucket memory size and the number of processors,
        determine the size of the memory unit attached to each of the
        processors by using Table IV.

Step 2: Choose a tube size (this will be in the range 10-30 M bits) and
        then determine the number of tubes required for the bucket
        memory by using Table XI.

Step 3: Divide the total number of tubes by the number of processors
        to obtain the number of tubes per processor.

Step 4: Knowing the word size and module size, determine the page size
        as follows:

        Module Size = Page size x Number of tubes in parallel.
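Steps 2 through 4 reduce to three divisions. The sketch below is illustrative only: it replaces the Table XI lookup of Step 2 by a direct ceiling division, and it assumes that the number of tubes operated in parallel equals the word size, as the bit-interleaved organization described above suggests. The function and parameter names are hypothetical.

```python
def ceil_div(a, b):
    """Integer division rounded up."""
    return -(-a // b)

def ebam_design(bucket_bytes, processors, tube_bits, module_bytes, word_bits):
    """Return (total tubes, tubes per processor, page size in bytes)."""
    # Step 2 (assumed equivalent to Table XI): tubes needed for the bucket memory.
    tubes = ceil_div(bucket_bytes * 8, tube_bits)
    # Step 3: tubes attached to each processor.
    tubes_per_processor = ceil_div(tubes, processors)
    # Step 4: module size = page size x number of tubes in parallel,
    # assuming one tube per bit of the processor word.
    page_bytes = module_bytes // word_bits
    return tubes, tubes_per_processor, page_bytes

# 9.6 x 10^8 bytes, 16 processors, 30-Mbit tubes, 16-Kbyte modules, 16-bit words.
print(ebam_design(960_000_000, 16, 30_000_000, 16 * 1024, 16))
```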


b. Design Example

Bucket size = 9.6 x 10^8 bytes
No. of processors = 16
Module size = 16 Kbytes
Word size = 16

Step 1: Size of memory attached to a processor = (9.6 x 10^8) / 16 bytes
        = 6 x 10^7 bytes.

Step 2: Let tube size be 30 M bits. Number of tubes required
        = (9.6 x 10^8 x 8) / (30 x 10^6) = 256.

Step 3: Number of tubes per processor = 256 / 16 = 16 tubes.

Step 4: Page size = (Module Size) / (Word Size) = (16 x 1024) / 16 = 1024 bytes.
        Number of modules per processor = (Memory attached to a
        processor) / (Module Size) = (6 x 10^7) / (16 x 10^3) = 3750.


c. Reliability Consideration - The failure rate of EBAM tubes is reported to
be 20% in a 20,000 hour replacement period [9]. In the design example, we had
a tube count of 256. Thus we get an effective MTBF of about 400 hours. This
figure compares very favorably with the MTBF of other memory systems. Of course,
in our design, we had assumed fairly large tube sizes, which kept the tube count
rather low. But even for small tube sizes (say, 10 Mbits), the device count is
not greater than 800, giving us a worst case MTBF of about 250 hours.

D. BMS Designs in Retrospect

The main objective of the design exercises in the preceding sections was
to investigate the suitability of the three leading contenders for the
realization of the bucket memory. From these exercises we have learned a number of
things. Let us elaborate. All three technologies may be applicable to the
problem at hand; however, the range of applicability is different for different
technologies. For example, bubble memories and CCDs are feasible only for
realizing small to medium size bucket memories. EBAM systems, on the other
hand, are economical only for larger bucket memories. In the case of bubble
systems, the shift rate is too low, so we cannot have very large loops
or a great many loops in one chip. This implies small chips,
which in turn increases the device count. We saw earlier that device count has
a profound effect on reliability. In the case of CCDs, we found that
increasing chip density meant sacrificing access time, something that we
can ill afford. Hence, CCDs become doubtful starters for large bucket memories.
Why is EBAM economical only for large systems? Because, if the bucket memory
size is small, the number of tubes per processor is proportionately small.
Since the tubes attached to a processor share the electronics, it is
uneconomical to have a low level of sharing. Further, if the number of tubes
per processor falls below the word size, we might encounter reliability
problems. The reader may observe that all the problems described above
have their origin in access times and device capacity. If the access time
of a memory organization using a particular technology is not within the
constraints imposed by our application, we cannot use that organization. If
the capacity of the smallest integral unit of memory realized by a technology
is large, then we cannot use that technology economically to build small
memory systems.

In our design algorithms, we have not calculated the cost of each of
the systems, which should be part of any viable design. Such analysis is
being conducted in conjunction with an analysis of the architecture. The
presentation of the above designs should be regarded as a first level attempt
at utilizing emerging technologies.

3.2.3 The Processing Elements (PEs) of the BMS


A. The Number and Nature of the PEs - The three design algorithms presented earlier
require the number of processing elements as an input argument. This number
cannot be arbitrarily chosen. For example, in the case of an EBAM-based 10^8-byte
bucket memory, the number of processors for a tube size of 30M bits cannot
be greater than 27 (see Table XI). In other words, an upper bound on the
number of processors that may be chosen is imposed by the size of the memory
unit that can be attached to a processor. In the case of EBAM systems, a
tube is the smallest memory unit that can be attached to a processor. A
lower bound is obtained by considering the processing capacity of the processor
and the retrieval time of the storage system. Recall that we required a
1 ms retrieval time for an arbitrarily selected keyword directory entry.
If several modules have to be searched on the average in order to retrieve a
directory entry, then it may be desirable to increase the number of
processors. Thus, a bucket can be worked upon by a large number of processors.
Each of the processors, then, needs to search over a smaller area, possibly
a single module.
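The upper bound quoted for the EBAM case follows from the fact that a processor needs at least one tube. A minimal sketch, assuming the Table XI entry is the bucket memory size divided by the tube size and rounded up (the function name is hypothetical):

```python
def max_processors(bucket_bytes, tube_bits):
    """Upper bound on processors: at most one processor per tube."""
    total_bits = bucket_bytes * 8
    return -(-total_bits // tube_bits)   # integer division rounded up

print(max_processors(10**8, 30 * 10**6))   # 10^8 bytes, 30-Mbit tubes
```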

In choosing processors to manipulate the bucket memory, consideration
should be given to the compatibility of the processor and memory. For
example, the access times and transfer rate in the case of EBAM systems are
very good. Hence, fairly powerful and fast processing elements must be
employed. On the other hand, magnetic bubble memory systems have poor
access and transfer times. Therefore, slower processors may be adequate.
Also, the decision of whether the data in the BMS is to be processed
on-the-fly or in a buffered mode can be made only when the speed compatibility
issue is determined. In the ensuing discussions, we describe the data
structures maintained by each of the processors and the algorithms executed
by them. These discussions are independent of processor technology (i.e.,
independent of whether the processing elements are made of array logic (LSI)
or random logic (MSI)). The algorithms are designed to be executed
on-the-fly. However, they may be executed on buffered data also if the processor
is unable to keep up with the retrieval speed of the storage system.

B. Data Structures Maintained by PE

Earlier, we have made two important statements regarding physical buckets.
First, buckets must be of variable size and are realized with relatively
small physical blocks. Second, in the design of the BMS storage, we define
modules to be the smallest accessible units. Thus, in the PEs, we need a
mechanism to allocate modules to physical buckets and to access the modules
allocated to a physical bucket.

In order to identify the modules allocated to a physical bucket,
the PE maintains certain data structures. These data structures are kept in
fast random access memories using bipolar technologies. As we shall see,
the memory requirement of these data structures is small enough to afford us
relatively expensive high-speed RAMs. The translation of a physical bucket
name to a set of module numbers is carried out with the help of a hash vector
and a module allocation table (MAT) as depicted in Figure 18. The low order
bits of the physical bucket name are used to select an entry in the hash vector.
The entry consists of a module number R_po, which is also a pointer to the
R_po-th entry in the MAT. If the low order bits of two bucket names are
identical, then they both select the same entry in the hash vector. The
overflow is handled by chaining in the MAT. The high order bits of an entry
in the MAT identify the physical bucket to which the module represented by
the entry is allocated. As shown in Figure 18, the module R_po represented
by the R_po-th entry in the MAT is allocated to the physical bucket with the
name (h^,h^). The low order bits of the entry in the MAT are used as a pointer
to the next entry in the MAT which contains allocation information about a
bucket whose name is mapped to the same entry in the hash vector as the first
one. It is possible that more than one entry in the MAT has allocation
information for the same bucket (called multiple allocation) as shown in
Figure 19.

In summary, the module number R_po serves a dual function. First, it
indicates that R_po has been allocated to a bucket. Second, it serves as a
pointer to the next entry in the MAT which contains allocation information
about the same bucket or another bucket whose name has the same low order
bits as the first one. We also note that the modules numbered 0 through r-1
are not available for allocation. These modules are used as back-up memory
for the data structures of the PE. The value of r is determined by the number
of modules in a memory unit and the size of each of the modules. Location
0 of the MAT is used as the head of a list of available modules.

In order to keep track of the amount of space available in each of the
modules, the PE maintains a module space table (MST) as shown in Figure 20.
The n-th entry in the MST records the number of bytes available in the n-th
module.


Location 0 of the MAT is not allocated to any bucket. It is used
to chain all the available modules, as shown by the dotted line.

Figure 18. Mapping a Physical Bucket Name into a Set of Module Numbers.


C.  Sizes  and  Load  Conditions  of  the  Data  Structures 

The number of entries in the hash vector could assume several possible
values with different performance figures. For example, if the number of
modules available in a memory unit is, say, n, the number of entries in the
hash vector could be 0.25n, 0.5n, n, 2n, 4n, etc. The larger the vector
size, the lower the probability of two distinct (low order bits of) bucket
names hashing to the same entry in the hash vector. The number of bits required
in each entry of the hash vector depends on the number of modules in the memory
unit attached to a PE. In Table XII, we have tabulated the number of modules
required as a function of memory unit size and module size. The memory unit
sizes are as they appear in Table V.

In Table XIII, we tabulate the number of bits required by each entry in
the hash vector and, in parentheses, the number of bits required to address
the hash vector for various hash vector sizes. We also tabulate for each hash
vector size two performance figures under extremely light and extremely heavy
load conditions. The two performance figures pertain to the number of entries
that must be searched until a particular entry is located or is determined not
to be in the table. This quantity is called the number of probes. It has
been shown in [10] that for a hash vector using overflow chaining the average
number of probes needed to find an entry is C where

        C = 1 + LF/2

and the average number of probes needed to discover that a particular entry
is not in the hash table is C' where

        C' = LF + exp(-LF)

Here, LF is defined as the ratio of the number of allocated entries in the
MAT to the number of entries in the hash vector. When the MAT is allocated
to 0.9 of its capacity, then we shall consider it heavily loaded. When
the table has only 10% of entries allocated, then we shall consider it
lightly loaded.

Table XII. Number of modules in a memory unit as a function of the memory
unit size and module size.
[Table body not recoverable. Surviving rows: 160, 320, 400, 800, 1600, 3200,
4000; and 80, 160, 200, 400, 800, 1600.]

Table XIII. Number of bits required in the hash vector and number of bits
required to address the hash vector as a function of normalized hash vector
size and number of modules.
[Table body not recoverable. The surviving legend follows.]

    Normalized Hash Vector Size = Actual Hash Vector Size / Number of Modules

    (A) Number of bits required to address the hash vector
          = ceiling( log2( (Number of Modules) x (Normalized Hash Vector Size) ) )

    (B) Number of bits in each entry of the hash vector
          = ceiling( log2( Number of Modules ) )

    Number of bits in each entry of the Module Allocation Table
          = Physical Bucket Name Size - A + B

    LF (heavy load) = 0.9 x (Total number of entries in MAT)
                          / (Number of entries in hash vector)
                    = 0.9 / (Normalized Hash Vector Size)

    LF (light load) = 0.1 x (Total number of entries in MAT)
                          / (Number of entries in hash vector)
                    = 0.1 / (Normalized Hash Vector Size)

    (See text for the definition of C and C'.)
From Table XIII, we find that by increasing the relative hash vector size we
get better performance in terms of a smaller number of probes. However, the
percentage increase in performance is not the same each time we double the hash
vector size. For example, in going from a normalized hash vector size of 0.25
to 0.5 the percentage drop in the number of probes, C, is 32%, but in going
from 0.5 to 1.0 the percentage drop is only about 23%.

In Table XIV we tabulate memory requirements and performance gain/loss
achieved for various sizes of the hash vector. We have arbitrarily chosen the
performance of a hash vector whose normalized size is 1 to have unity
performance. From the table we see that the performance under heavy load
conditions improves by 28-30% when the hash vector size is made 4 times the
size of the MAT. However, the space requirement, in going from a relative
hash vector size of 1 to 4, increases from 240 to 460 bytes (a 100%
increase) for 80 modules and from 64K to 160K (an increase of 150%) for 16000
modules. Clearly, the performance improvement is not cost-effective. Under
light load conditions, the performance improvement is even less cost-effective.
In fact, under light load conditions we can get as high as 87.5% to 93% of the
performance with a hash vector which is one fourth the size of the unity
performance hash vector. But under heavy load conditions, small vector sizes
can really hurt performance (51%-35.9%). Thus the choice of the relative
vector size narrows down to between 0.5 and 2. We shall not attempt to narrow
down the range any further at this point, since a more detailed analysis in
terms of memory cost and overall performance requirements is needed.
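
The percentages quoted above can be checked directly from the two probe
formulas. The following is a minimal sketch, assuming only the definitions
given in the text (C = 1 + LF/2, C' = LF + exp(-LF), and LF equal to the
allocated fraction of the MAT divided by the normalized hash vector size).
The function names are illustrative and do not come from the report.

```python
import math


def load_factor(mat_fraction_allocated, normalized_size):
    """LF = allocated MAT entries / hash vector entries."""
    return mat_fraction_allocated / normalized_size


def probes_hit(lf):
    """Average probes to find an entry: C = 1 + LF/2."""
    return 1 + lf / 2


def probes_miss(lf):
    """Average probes to establish that an entry is absent: C' = LF + exp(-LF)."""
    return lf + math.exp(-lf)


# Heavy load (0.9) and light load (0.1) for each normalized hash vector size.
for size in (0.25, 0.5, 1, 2, 4):
    heavy, light = load_factor(0.9, size), load_factor(0.1, size)
    print(size,
          round(probes_hit(heavy), 3), round(probes_miss(heavy), 3),
          round(probes_hit(light), 3), round(probes_miss(light), 3))
```

Under heavy load this gives C = 2.8, 1.9 and 1.45 for normalized sizes 0.25,
0.5 and 1, which reproduces the 32% and roughly 23% drops cited in the text.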


D. Logic of the PE

The orders issued by the SMC to the PEs are as follows:

1) Find the number of modules in use for the physical bucket
   named m.

2) Find the total number of modules available.

3) Look up physical bucket m for a transformed keyword whose
   low order bits have the value T_l.

4) Retrieve from physical bucket m the directory entry of a
   transformed keyword whose low order bits have the value T_l.

5) Retrieve all index terms from the directory entries in the
   physical bucket m.

6) Delete from physical bucket m the index term i for the
   transformed keyword whose low order bits have the value T_l.

7) Insert into module j of physical bucket m the index
   term (i,(q)) for the transformed keyword whose low order
   bits have the value T_l.

8) Create an entry in physical bucket m for the transformed
   (security/clustering) keyword whose low order bits have the
   value T_l and the index is (i,(q)).

For each of the above orders there is one corresponding algorithm which is
executed by the PEs. The first two orders are used by the SMC to ensure that
modules of a physical bucket are uniformly distributed over all the memory
units.


ALGORITHM A: To find the number of modules in use with a bucket and
             determine the space available in each of the modules.
             (Result of the algorithm is available in an array called
             RESULT.)

Input Arguments: Bucket number m

Step 1: r ← 0

Step 2: Use the appropriate number of bits in m to index the hash
        vector table (see Figure 18). If the entry is empty, go to
        Step 7; else, extract the pointer from the entry. Call it R_p.

Step 3: If the high order bits in the R_p-th entry of MAT match
        the high order bits of m, then go to Step 4; else, go to
        Step 5.

Step 4: r ← r+1; RESULT[r] ← (R_p, MST[R_p]).

Step 5: R_p ← MAT[R_p] (i.e., extract the pointer to the next entry in
        MAT).

Step 6: If R_p = 0, then go to Step 7; else, go to Step 3.

Step 7: RESULT[0] ← r. Terminate.

Response: RESULT[0] contains the number of modules allocated to a physical
          bucket. RESULT[1] through RESULT[r] indicate the space
          availability in each of the modules.
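
The chain walk of Algorithm A can be sketched in a few lines. This is an
illustrative sketch only: the representation of the MAT as a mapping from
module number to a (high order bits, next pointer) pair, the use of 0 to mark
an empty hash vector entry, and the parameter `low_bits` are assumptions made
for the example, not details given in the report.

```python
def modules_in_use(m, hash_vector, mat, mst, low_bits):
    """Sketch of PE Algorithm A: returns [count, (module, free_bytes), ...]."""
    low = m & ((1 << low_bits) - 1)   # low order bits select the hash vector entry
    high = m >> low_bits              # high order bits are compared against MAT entries
    result = [0]
    rp = hash_vector[low]             # assumption: 0 encodes an empty entry
    while rp != 0:                    # Step 6: a zero pointer ends the chain
        entry_high, next_ptr = mat[rp]
        if entry_high == high:        # Step 3: module rp belongs to bucket m
            result.append((rp, mst[rp]))   # Step 4
        rp = next_ptr                 # Step 5
    result[0] = len(result) - 1       # Step 7
    return result


# Hypothetical contents: modules 3 and 9 belong to bucket 21, module 7 to bucket 25.
hash_vector = [0, 3, 0, 0]
mat = {3: (5, 7), 7: (6, 9), 9: (5, 0)}
mst = {3: 100, 7: 40, 9: 250}
print(modules_in_use(21, hash_vector, mat, mst, low_bits=2))
```

With a 2-bit low order field, bucket 21 (high bits 5, low bits 1) and bucket
25 (high bits 6, low bits 1) share one chain, and the lookup separates them
by comparing the high order bits stored in each MAT entry.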


ALGORITHM B: To find the number of modules available in the memory unit
             attached to a PE. (Each PE maintains a count called COUNTER to
             keep track of the number of modules available. This counter is
             manipulated by other algorithms.)

Step 1: RESULT[0] ← COUNTER. Terminate.


ALGORITHM C: To look up bucket m for a transformed keyword whose low order
             bits have the value T_l.

Input Arguments: Bucket name m, and low order bits T_l.

Step 1: SEARCHFLAG ← 'false'

Step 2: Call Algorithm A to obtain the set of modules allocated to
        bucket m.

Step 3: If RESULT[0] = 0, then go to Step 8; else, go to Step 4.

Step 4: r ← RESULT[0]; j ← 1.

Step 5: Search the module given by RESULT[j] for a directory entry whose
        transformation field matches the argument T_l. If a match occurs,
        then go to Step 7; else, go to Step 6.

Step 6: j ← j+1; if j ≤ r, then go to Step 5; else, go to Step 8.

Step 7: SEARCHFLAG ← 'true'; RESULT[1] ← RESULT[j];
        RESULT[2] ← MST[RESULT[j]].

Step 8: RESULT[0] ← SEARCHFLAG. Terminate.

Note: In the above algorithm Step 5 is a crucial one, as it is responsible
      for the search operation. Searching can be done on the fly or on a
      buffered basis depending on processor speed and readout times for a
      module.

ALGORITHM D: To retrieve from physical bucket m the directory entry of a
             transformed keyword whose low order bits have the value T_l.

Input Arguments: Bucket name m, low order bits of transformed value T_l.

Step 1: SEARCHFLAG ← 'false'

Step 2: Use the hash vector and module allocation table (Figure 18) to
        obtain the module number of the next module allocated to m.
        Call it j.

Step 3: If Step 2 does not yield a module name, go to Step 6.

Step 4: Search module j for the directory entry of a transformed
        keyword whose low order bits are given by T_l. If a match is
        found, go to Step 5; else, go to Step 2.

Step 5: SEARCHFLAG ← 'true'; store directory entry in RESULT array.

Step 6: RESULT[0] ← SEARCHFLAG. Terminate.

Response: Directory entry in RESULT array (if found).

ALGORITHM E: To retrieve all the directory entries in bucket m.

Input Arguments: Bucket name m

Step 1: SEARCHFLAG ← 'false'

Step 2: Use the hash vector and module allocation table (Figure 18) to
        obtain the number of the next module allocated to m. Call it j.

Step 3: If Step 2 does not yield a module name, go to Step 6.

Step 4: Read out module j and store its contents in RESULT array.

Step 5: SEARCHFLAG ← 'true'. Go to Step 2.

Step 6: RESULT[0] ← SEARCHFLAG. Terminate.

Response: All directory entries in RESULT array.

ALGORITHM F: To delete from physical bucket m the index term i for the
             transformed keyword whose low order bits have the value T_l.

Input Arguments: Bucket name m, index i, transformed value's low order
                 bits T_l.

Step 1: Call Algorithm A to find all modules allocated to bucket m.

Step 2: If RESULT[0] = 0, then terminate.

Step 3: j ← 1; r ← RESULT[0].

Step 4: Search the module identified by RESULT[j] for the directory entry
        of a transformed keyword whose low order bits have the value
        T_l. If a match occurs, go to Step 6; else, go to Step 5.

Step 5: j ← j+1; if j ≤ r, go to Step 4; else, terminate.

Step 6: [Match has occurred.] Look for index term i within the entry.
        If found, delete it. If i is the last index in the entry, then
        go to Step 7; else, go to Step 9.

Step 7: [Entry to be deleted.] EFLAG ← 1. Reduce the count of the number of
        entries in the module by 1. If the number of entries is zero, then go
        to Step 8; else, go to Step 9.

Step 8: [Module to be freed.] Use the hash vector to trace the chain of
        pointers for the bucket m. When the j-th entry is accessed,
        update the entry which points to the j-th entry (called the
        predecessor) by copying the contents of the j-th entry into the
        predecessor. Link up the j-th entry into the AVAIL list. [The
        module is now available for reallocation.]

Step 9: Increase MST[j] by the length of the index term just deleted. If
        EFLAG = 1, then increase MST[j] by the number of bytes occupied
        by the transformed value and index term count.
        EFLAG ← 0. Terminate.

Response: None
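
Step 8 of Algorithm F, which frees a module, amounts to unlinking one entry
from a singly linked chain and pushing it on the AVAIL list headed at
location 0 of the MAT. The sketch below is one possible reading, not the
report's implementation. It assumes that "copying the contents of the j-th
entry into the predecessor" refers to the pointer field only, that a MAT
entry is a (high order bits, next pointer) pair, and that 0 ends a chain.

```python
def free_module(j, low, hash_vector, mat):
    """Unlink module j from the chain for hash vector entry `low`, then add it to AVAIL."""
    next_of_j = mat[j][1]
    if hash_vector[low] == j:
        hash_vector[low] = next_of_j          # j was first in the chain
    else:
        p = hash_vector[low]
        while p != 0 and mat[p][1] != j:      # walk until p is the predecessor of j
            p = mat[p][1]
        if p == 0:
            raise ValueError("module j is not on this chain")
        mat[p] = (mat[p][0], next_of_j)       # predecessor now skips j
    mat[j] = (None, mat[0][1])                # push j on the AVAIL list headed at location 0
    mat[0] = (None, j)


# Hypothetical chain 3 -> 7 -> 9 hanging off hash vector entry 1; AVAIL is empty.
hash_vector = [0, 3, 0, 0]
mat = {0: (None, 0), 3: (5, 7), 7: (6, 9), 9: (5, 0)}
free_module(7, 1, hash_vector, mat)
print(mat)
```

Keeping the bucket's high order bits in the predecessor untouched matters
here, because a chain may hold modules of several buckets that share the same
low order bits.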

ALGORITHM G: To insert into module j of physical bucket m the index term
             (i,(q)) for the transformed keyword whose low order bits have
             the value T_l.

Input Arguments: Bucket name m, module number j, index term (i,(q))
                 and transformed value T_l.

        A call to this algorithm is always preceded by a call to
        Algorithm C, which looks up the bucket containing the keyword
        entry into which the index term is to be inserted. Recall
        that Algorithm C returns the name of the module containing
        the entry if the lookup is successful, i.e., SEARCHFLAG =
        'true'.

Step 1: Read module j into the buffer if it is not already in the buffer.
        [It is quite likely that the information in module j is already in
        the input buffer, since Algorithm C searched the module before
        the invocation of this algorithm.]

Step 2: Scan the index terms of the entry for the transformed value T_l.
        If i already exists, then terminate.

Step 3: Move the directory entries of other keywords in the buffer to
        make space for the current index term (i,(q)). [There will
        always be enough room, since the SMC does not issue this
        order if there is not enough space in the module.]

Step 4: Update MST[j] to reflect the space occupied by the new index
        term. Terminate.

Response: None

ALGORITHM H: Create an entry in bucket m with index term (i,(q)) for the
             (security/clustering) keyword whose low order bits have the
             transformed value T_l.

Input Arguments: Bucket m, T_l, index term (i,(q))

Step 1: Call Algorithm A to find the set of modules already allocated to
        bucket m. If RESULT[0] = 0 (no module is allocated), go to Step 3.

Step 2: Find from the set of modules identified by RESULT[1] through
        RESULT[r] the module k which has the largest space available
        and exceeds the space required by the entry. If no such module
        exists, go to Step 3; else, go to Step 4.

Step 3: [Allocate new module.] From the AVAIL list, pick up the first
        available module k, and link it up in the chain emanating from
        the hash vector entry for m (see Figures 18 and 19). Go to
        Step 5.

Step 4: Read the module k chosen in Step 2.

Step 5: Place the argument entry in the first available location indicated
        by MST[k]. Update MST[k] to reflect the space occupied by the new
        entry.

Step 6: Write back the module k. Terminate.

Response: None
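
The module choice made in Steps 1 through 3 of Algorithm H can be sketched as
follows. This is a hedged illustration: `result` stands for the RESULT array
returned by Algorithm A as a list of (module, free bytes) pairs after the
count, and `avail` is a plain Python list standing in for the AVAIL chain.
Both representations are assumptions made for the example.

```python
def choose_module(result, needed, avail):
    """Return (module, newly_allocated) for an entry that needs `needed` bytes."""
    # Step 2: among the bucket's modules, keep those whose free space exceeds the need.
    candidates = [(free, mod) for mod, free in result[1:] if free > needed]
    if candidates:
        free, mod = max(candidates)       # largest free space wins
        return mod, False
    # Step 3: no allocated module fits, so take the first module on the AVAIL list.
    if not avail:
        raise RuntimeError("no module available for allocation")
    return avail.pop(0), True


print(choose_module([2, (3, 100), (9, 250)], needed=120, avail=[11]))
```

The second element of the returned pair tells the caller whether the module
still has to be linked into the chain for bucket m, which corresponds to the
branch between Step 3 and Step 4.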

E. PEs and the Structure Memory Controller

The above algorithms have been carefully tailored to eliminate much of the
decision making from the logic of the PEs. The algorithms are only
capable of transmitting information to the SMC, which makes the decisions. These
decisions pertain to uniform distribution of keyword directory entries,
taking care of overflows in modules, etc. For example, the SMC, before making an
insert request, orders the PEs to see if the keyword already has an entry, and
to determine the amount of space available in the module containing the entry.
Based on these two pieces of information, the SMC can make a decision on where
to insert the index term. We shall see more of this decision making procedure
in the algorithms of the SMC.

Most of the algorithms need working space (for insertions, deletions,
etc.). The working space required is of the order of the size of a module. In
some instances, particularly in the "Retrieve All" order (Algorithm E), the
amount of data retrieved may exceed the work space available. In such cases,
the retrieval is done in burst mode, i.e., when the work space fills up, it is
transmitted to the SMC before more retrieval is attempted.

3.3 The Structure Memory Controller (SMC)

The SMC is responsible for carrying out the following functions:

• Transformation of logical bucket name to physical bucket name

• Controlling the bucket memory system (BMS)

• Controlling the structure memory information processor (SMIP)

• Maintaining the look-aside buffer memory system

• Executing the following DBCCP commands

    • Retrieve index term for a keyword predicate

    • Retrieve with count the index terms for a (security/
      clustering) keyword

    • Insert index term for a keyword

    • Delete index term for a keyword

    • Reset

The algorithms executed by the SMC are sequential in nature; parallelism is
induced by the simultaneous invocation of the PEs of the BMS. The critical
requirement of the SMC hardware is that it should be able to move information
in and out of the SM and make decisions at a rate which is comparable to the
rate of flow of information from the PEs.

3.3.1 Transformation of Logical Bucket Name to Physical Bucket Name

The rationale for transforming logical bucket names to physical bucket
names has already been put forth in an earlier section. Here we discuss its
implementation. In Figure 21, we detail the basic data structure required to
implement the mapping. The structures are similar to those involved in mapping
a physical bucket into a set of module names. The main difference here is
that for a given logical bucket name there is only one physical bucket name.


Figure 21. Logical to Physical Bucket Transformation
(Figure labels: Logical Bucket Name; Physical Bucket Name Table (PBNT).)

The m low order bits of the logical name are used to access the index vector.
The corresponding entry in the index vector contains a pointer to the physical
bucket name known to the SMC. Thus (since we had assumed a 16 bit physical
bucket name in earlier sections), we need 2^16 entries in the PBNT. Each entry
has the (24-m) high order bits of the logical bucket name to which the physical
bucket corresponds, and a pointer to the next entry in the table which is
allocated to a logical bucket name with the same m low order bits as the first
logical bucket name. This is clear from the chaining for the logical bucket
name (b_h, b_l) shown in the figure. The larger the value of m, the faster the
mapping process, since a larger index vector implies fewer probes of the PBNT.
An analysis similar to the one conducted for the module allocation table can be
made for the PBNT. Since the PBNT is a larger table, it is not feasible to
have an index vector with the same number of entries or more. Tolerable
performance can be obtained by having an index vector with about one eighth
to one sixteenth the number of entries in the PBNT. [The average number of
probes for a successful match will be about 5 and 9, respectively, for 90%
load on the PBNT.] The memory requirement for a PBNT with 2^16 entries is 65K
words; and for the index vector, the memory requirement is 4K words and 8K
words for 2^12 and 2^13 entries, respectively [a word = 4 bytes].
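
The sizing figures above follow from simple arithmetic, which the sketch
below reproduces. It assumes one word per table entry and reuses the
successful-search estimate C = 1 + LF/2 from the module allocation table
analysis; the function name and parameters are illustrative only.

```python
def pbnt_sizing(name_bits=16, ratios=(16, 8), load=0.9):
    """Return PBNT size and, per index vector ratio, (index entries, average probes)."""
    pbnt_entries = 2 ** name_bits
    rows = []
    for ratio in ratios:
        index_entries = pbnt_entries // ratio
        lf = load * pbnt_entries / index_entries   # allocated PBNT entries per index entry
        rows.append((index_entries, 1 + lf / 2))   # C = 1 + LF/2
    return pbnt_entries, rows


entries, rows = pbnt_sizing()
print(entries, rows)
```

This gives 65,536 PBNT entries and index vectors of 4,096 and 8,192 entries.
At 90% load the estimate is roughly 8.2 and 4.6 probes for the one sixteenth
and one eighth ratios; a fully loaded table (load = 1.0) gives exactly 9 and
5, which matches the rounded figures quoted in the text.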


3.3.2 The DBCCP Commands Executed by the SMC

Each of the five DBCCP commands listed earlier is carried out by an algorithm.
For the sake of clarity, these algorithms are presented as if the look-aside
buffer did not exist. Later on, we shall discuss the effect of the look-aside
buffer on each of the algorithms.


ALGORITHM A: To execute the command, "Retrieve index terms for a keyword
             predicate."

Input Arguments: 1. Predicate type from DBCCP
                 2. Transformed value of the keyword from KXU

[Note that the DBCCP sends the predicate type (=, ≠, <, ≤, >, ≥) along with
the command, while the keyword is sent via the KXU where it is transformed as
described in Section 2.]

Step 1:  Obtain the transformed keyword value T(K) from the KXU.

Step 2:  If the predicate is '=' or '≠', then go to Step 3; else, go
         to Step 7.

Step 3:  Use the index vector and PBNT to obtain the name of the physical
         bucket corresponding to the logical bucket name. Call it m.

Step 4:  Issue the order to the PEs, "Retrieve from physical bucket m
         the directory entry of a transformed keyword whose low order bits
         have the value T_l(K)".

Step 5:  If the predicate is '=', then send the retrieved indices to the
         SMIP for further processing, set DFLAG = 0, and terminate. Else,
         go to Step 6.

Step 6:  [The predicate is '≠'.] Send the retrieved indices to the SMIP
         for further processing, set DFLAG = 1, and terminate. [See
         Section 4.1 for the function of DFLAG.]

Step 7:  [Predicate is one of '<', '≤', '>', '≥'.] If the predicate is of
         type '<' or '≤', then go to Step 12; else, go to Step 8.

Step 8:  [Predicate is either '>' or '≥'.] From the logical bucket name
         in the transformed value, obtain the partition number p. If
         the predicate is '>', then p ← p + 1.

Step 9:  Use the index vector and PBNT to determine the names of the
         physical buckets for all the logical bucket names whose partition
         numbers are equal to or greater than p. Call this set of
         physical bucket names I.

Step 10: For each physical bucket name m in I, issue the order,
         "Retrieve from physical bucket m all the index terms of all
         the directory entries."

Step 11: Send the retrieved indices to the SMIP for further processing
         with DFLAG = 0. [See Section 4.2 for details.] Terminate.

Step 12: [Predicate is either '≤' or '<'.] From the logical bucket name
         in the transformed value, obtain the partition number p. If
         the predicate is '<', then p ← p - 1.

Step 13: Use the index vector and PBNT to determine the names of the
         physical buckets for all the logical bucket names whose partition
         numbers are equal to or less than p. Call this set of physical
         bucket names I. Go to Step 10.


Notes on Algorithm A:

This algorithm provides a limited facility for processing keyword predicates
which are of the type '<', '≤', '>' and '≥'. The limitations arise from the
manner in which keyword directory entries are stored in the BMS. The reader
will recall from Section 2 that the partition number in the logical bucket
name associated with a keyword reflects a range (of values) within which the
keyword happens to be. A range has a lower extreme value and an upper extreme
value.

Algorithm A will perform correctly under the following conditions:

1) The predicate '<' must be associated with the lower extreme value of
   some range.
2) The predicate '≤' must be associated with the upper extreme value of
   some range.
3) The predicate '>' must be associated with the upper extreme value of
   some range.
4) The predicate '≥' must be associated with a lower extreme value of
   some range.

These four conditions are not nearly as restrictive as they seem to be, since
in practice range searches involving these predicates are carried over ranges
which can be predicted in advance, and therefore can be associated exactly
with partitions. If the second or fourth condition is violated, more indices
will be retrieved than necessary. If the first or third condition is
violated, some relevant indices may not be retrieved.
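
The partition selection performed in Steps 7 through 13 of Algorithm A can be
illustrated with a short sketch. It assumes partitions are identified by
integers that increase with the keyword value, which the report implies but
does not state in this section; the function name is hypothetical.

```python
def partitions_to_search(predicate, p, all_partitions):
    """Return the partition numbers whose buckets are retrieved for a range predicate."""
    if predicate == ">":
        p += 1                      # Step 8: strict '>' skips the keyword's own partition
    elif predicate == "<":
        p -= 1                      # Step 12: strict '<' skips the keyword's own partition
    if predicate in (">", ">="):
        return [n for n in all_partitions if n >= p]   # Step 9
    if predicate in ("<", "<="):
        return [n for n in all_partitions if n <= p]   # Step 13
    raise ValueError("not a range predicate")


for pred in ("<", "<=", ">", ">="):
    print(pred, partitions_to_search(pred, 3, range(6)))
```

Because whole partitions are either included or excluded, the result is exact
only when the keyword value sits on the partition boundary named in the four
conditions above. Otherwise the keyword's own partition is either retrieved
in full or skipped in full.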


ALGORITHM B: To retrieve with count the index terms for the (security/
             clustering) keyword.

Input Arguments: 1. Count value from DBCCP
                 2. Transformed value of the keyword K from KXU

[This algorithm is used to find out the name of the cluster or security
atom to which a record belongs. The predicate associated with this
algorithm is always '='. The count value is the number of clustering/
security keywords present in a record belonging to the cluster or
security atom.]

Step 1: Obtain the transformed keyword value T(K) from the KXU.

Step 2: Use the index vector and PBNT to obtain the name of the physical
        bucket corresponding to the logical name in T(K). Call it m.

Step 3: Issue the order to the PEs, "Retrieve from physical bucket m the
        directory entry of a transformed keyword whose low order bits
        have the value T_l(K)".

Step 4: For each of the index terms retrieved in Step 3, compare the
        associated count field against the argument count value supplied
        by the DBCCP. Form the set S of index terms for which the
        equality operation is successful.

Step 5: Send the set S to the SMIP with DFLAG = 0. Terminate.


ALGORITHM C: Insert index term for a keyword.

Input Arguments: 1. Index term from DBCCP.
                 2. Count value from DBCCP if applicable [only
                    in case of security or clustering keywords].
                 3. Transformed value of keyword from KXU.

Step 1:  Obtain the transformed keyword value T(K) from the KXU.

Step 2:  Use the index vector and PBNT to obtain the name of the physical
         bucket corresponding to the logical bucket name. If the PBNT
         fails to yield the physical bucket name, then allocate a
         physical bucket from the AVAIL list. Call the bucket name m.
         If bucket m is newly allocated, skip to Step 13.

Step 3:  Issue the order, "Look up bucket m for the directory entry of
         the transformed keyword whose low order bits have the value T_l"
         to all the PEs.

Step 4:  If the SEARCHFLAG of one or more PEs is 'true', then go to Step
         5; else, go to Step 7.

Step 5:  Determine the PE whose module has the largest space available.
         Call it j.

Step 6:  If the space in j is greater than that required by the index
         term (and its count if applicable), then issue the order,
         "Insert into physical bucket m, module j, the index term
         (i,(q)) for the keyword whose low order bits have the transformed
         value T_l" and terminate. If the space in j is not enough, go to
         Step 7.

Step 7:  [Control comes to this step if no PE reports the existence of the
         directory entry of the concerned keyword in Step 3, or if the
         space in any of the modules is insufficient to contain the index
         term as determined in Step 6.] Issue the order, "Find the number
         of modules in use for physical bucket with name m," to all PEs.

Step 8:  From the response to the order in Step 7, choose the set of PEs
         which have no module allocated to m and call this set Z. If the
         set Z is empty, go to Step 12. [Z = {p_1, p_2, ..., p_n}].

Step 9:  To each of the PEs identified in Z, issue the order, "Find the
         number of modules available." Call the response set Z'
         [Z' = {(p_1, n_1), (p_2, n_2), ...}].

Step 10: From Z' choose the one PE p' which has the largest number n'
         of modules available.

Step 11: Issue to p' the order, "Create an entry in bucket m for the
         transformed (security/clustering) keyword whose low order bits
         have the value T_l and the index is (i,(q))". Terminate.

Step 12: [Control comes here when all PEs have modules allocated to m.]
         Choose the PE(s) which has (have) the smallest number of modules
         allocated to m. If the choice is reduced to one PE p', go to
         Step 11; else, let Z = {p | p has the smallest number of modules
         allocated to m}. Go to Step 9.

Step 13: [Control comes here when bucket m has been allocated in Step 2,
         and therefore the BMS has no entries in m.] Let Z = {all PEs};
         go to Step 9.
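
Steps 8 through 12 of Algorithm C select the PE that will receive a new
directory entry so that the modules of a bucket stay evenly spread. A minimal
sketch of that selection follows. The two dictionaries stand in for the
responses to the "number of modules in use" and "number of modules available"
orders; their shape and the function name are assumptions made for the
example.

```python
def choose_pe(modules_in_bucket, modules_available):
    """Pick the PE that should create the new entry for bucket m.

    modules_in_bucket: {pe: modules already allocated to m on that PE}
    modules_available: {pe: free modules in that PE's memory unit}
    """
    # Step 8: prefer PEs that hold no module of bucket m yet.
    z = [pe for pe, n in modules_in_bucket.items() if n == 0]
    if not z:
        # Step 12: otherwise take the PEs with the fewest modules of m.
        fewest = min(modules_in_bucket.values())
        z = [pe for pe, n in modules_in_bucket.items() if n == fewest]
        if len(z) == 1:
            return z[0]
    # Steps 9-10: among the candidates, choose the PE with the most free modules.
    return max(z, key=lambda pe: modules_available[pe])


print(choose_pe({0: 1, 1: 0, 2: 0}, {0: 5, 1: 2, 2: 7}))
```

For a newly allocated bucket (Step 13) every PE reports zero modules, so the
same function applies with Z equal to all PEs.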

ALGORITHM D: To delete an index term for a keyword.

Input Arguments: 1. Index term i from DBCCP
                 2. Transformed keyword value T(K) from KXU

Step 1: Obtain the transformed keyword value T(K) from the KXU.

Step 2: Use the index vector and PBNT to obtain the name of the physical
        bucket corresponding to the logical bucket. Call this name m.

Step 3: Broadcast to all PEs the order, "Delete from physical bucket m
        the index term i for the transformed keyword whose low order
        bits have the value T_l(K)".

Step 4: [Determine if this was the last entry in bucket m.] Broadcast
        to all PEs the order, "Find number of modules attached to
        bucket m."

Step 5: If all the PEs respond to the order in Step 4 with zero as the
        number of modules attached to the bucket m, then m can be
        reused; else, terminate.

Step 6: [Physical bucket m is freed.] Place the entry for m in the PBNT
        in the AVAIL list. Update the chain for the logical bucket name
        in the index vector. Terminate.


ALGORITHM E: To execute the command 'reset'.


Input Argument: None

[This algorithm is primarily intended to signal the SMC that a string of
retrieve commands is ended. The DBCCP always presents the SM with a
string of retrieve commands corresponding to a query conjunct or a
set of keywords in a record].

Step 1: Issue a command to the SMIP to retrieve all valid data units
from its memory and then clear the SMIP memory. Terminate.


3.3.3. The Relationship of SMC, SMIP, DBCCP, and KXU.

We have seen the data structures maintained by and the algorithms executed
by the SMC. These algorithms concern themselves with the retrieval (or
insertion) of information from (into) the array of PEs. Information so
retrieved is passed on to the SMIP, and the information to be placed in the
SM is received from the KXU and DBCCP. The overall flow of the information
through the SMC is schematically shown in Figure 22.




The PEs of the BMS can be selectively activated by the SMC using a mask
register. There is one bit in the mask register for each PE. An order issued
by the SMC is received by all the PEs, but only those which have the
corresponding mask bit turned on carry out the order. The bits in the mask
register can be set (or reset) by the SMC on an individual basis. Data
transfer between the PE and the SMC takes place over a data bus called the
SMPEBUS (see Figure 22). Such data transfer is always in response to, or a
part of, orders broadcast by the SMC. Thus, communication between the SMC and
the PEs is in a master-slave relationship, with the SMC assuming the master
role and the PEs assuming the slave role. The design of the control function
and the SMPEBUS is conventional and, therefore, will not be described here.
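
The selective activation of PEs through the mask register can be illustrated with a small Python sketch. The class and method names below are hypothetical and chosen only for illustration; the sketch models the one-bit-per-PE mask and the rule that every PE receives an order but only the PEs whose mask bit is on carry it out.

```python
class ProcessingElement:
    def __init__(self, pe_id):
        self.pe_id = pe_id
        self.carried_out = []  # orders this PE actually executed

    def carry_out(self, order):
        self.carried_out.append(order)
        return (self.pe_id, order)


class SMC:
    def __init__(self, n_pes):
        self.pes = [ProcessingElement(i) for i in range(n_pes)]
        self.mask = 0  # bit i on means PE i carries out orders

    def set_bit(self, i):
        self.mask |= 1 << i

    def reset_bit(self, i):
        self.mask &= ~(1 << i)

    def broadcast(self, order):
        # The order reaches every PE; only the PEs whose mask bit is
        # turned on respond to it.
        return [pe.carry_out(order)
                for i, pe in enumerate(self.pes)
                if (self.mask >> i) & 1]
```

For example, after set_bit(0) and set_bit(2) on a four-PE array, a broadcast is carried out by PEs 0 and 2 only.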

The relationship between the SMC and the SMIP is also a master-slave one,
with the SMC assuming the master role. Whenever the SMC executes a retrieve
command issued by the DBCCP, the response to that command is sent to the SMIP
as soon as the SMIP is ready to accept the data. Data transfer is always
from the SMC to the SMIP in burst mode. When the reset command is issued by
the DBCCP, the SMC informs the SMIP that the current string of data is
completed, and that the SMIP should send all valid data units to the index
translation unit (IXU). After retrieval, the SMIP clears its memory and
readies itself for the next data string.

3.4  The  Look-Aside  Buffer  Memory  (LABM) 

Before describing the LABM, we introduce a few notations which will be
used rather frequently in the ensuing discussion. A command received by the
SMC from the DBCCP will be called an input command, while a command already
present in the LABM will be called a buffer command. A command has up to three
arguments: a transformed keyword value T, an index term I, and a count C.

The subscript i will be used to identify input command arguments and the
subscript b will be used to identify buffer command arguments.

3.4.1  Design  and  Implementation  of  the  LABM 

The look-aside buffer memory is divided into two components: an associative
memory (AM) and a random access memory (RAM). The AM has n data cells, each
of which has a 24-bit search key and a pointer to the random access memory. The
search key is the logical bucket name specified in the transformed keyword value
of one or more buffer commands. The pointer points to a list of commands in the
RAM whose transformed keyword value arguments all refer to the logical bucket
name in the corresponding search key (see Figure 23a). The AM is searched on
the basis of a logical bucket name occurring in T_i. The response set of the
AM contains the pointers of those data cells whose search keys satisfy the
search criterion. The search criterion may be one of the relational
operators, for example '=', '≤', or '<'. The RAM which holds the commands is
divided into m entries, each of which can hold one command. Each entry has six
fields (see Figure 23b).

FIELD 1: List Pointer - This field points to the next command entry (in the
RAM) whose transformed keyword value refers to the same logical bucket name
as the command in the current entry. This field is ⌈log2 m⌉ bits long.

FIELD 2: Queue Pointer - This field is used to maintain a FIFO discipline
among all the commands awaiting execution in the LABM. The field points to
the update command which arrived next in the time sequence. It requires
⌈log2 m⌉ bits.

FIELD 3: Order Code - This field is used to indicate whether the command is
an insert or a delete command and is one bit long. [Order Code = 1 for
insertion; = 0 for deletion.]

FIELD 4: Transformation Value - This field holds the 48 bits of the transformed
value associated with the command.

FIELD 5: Index Term - This field holds the 28-bit index term which is to be
deleted or inserted.

FIELD 6: Count - This field is used to indicate the number of security (or
clustering) keywords in the security atom (or cluster) identified in the index
term in Field 5.
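
As an illustration, the six fields of a command entry can be written down as a record, and the width of an entry can be computed from the field widths given above. The sketch below is in Python; the names CommandEntry, pointer_bits and entry_bits are hypothetical, and since the width of the count field is not stated in the text it is passed in as an assumed parameter.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CommandEntry:
    list_ptr: Optional[int]   # Field 1: next entry for the same bucket name
    queue_ptr: Optional[int]  # Field 2: next entry in arrival (FIFO) order
    order_code: int           # Field 3: 1 for insertion, 0 for deletion
    t_value: int              # Field 4: 48-bit transformed keyword value
    index_term: int           # Field 5: 28-bit index term
    count: int                # Field 6: count of security/clustering keywords


def pointer_bits(m: int) -> int:
    """Bits needed to address one of m RAM entries, i.e. ceil(log2 m)."""
    return (m - 1).bit_length()


def entry_bits(m: int, count_bits: int) -> int:
    """Total entry width; count_bits is an assumption, not from the text."""
    return 2 * pointer_bits(m) + 1 + 48 + 28 + count_bits
```

With m = 256 entries each pointer needs 8 bits, so an entry with an (assumed) 8-bit count field would occupy 8 + 8 + 1 + 48 + 28 + 8 = 101 bits.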

3.4.2  Effects  of  the  LABM  on  SMC  Command  Execution 

Algorithms executed by the SMC are affected by the presence of the LABM.

A. Delete Commands (See Algorithm D in Section 3.3.2) - When a delete
command is received by the SMC, the SMC orders the AM of the LABM to search for
the logical bucket name contained in T_i. If the response set is empty, then
a command entry is created for the command in the RAM and a data cell is
created in the AM for the logical bucket name in T_i. The pointer in the data
cell is set to point to the command entry just created. If the response set
contains a pointer, then the corresponding list of commands is scanned. If
the list contains a command whose order code is 'insert' and whose index term
I_b is the same as I_i, then the command entry is deleted and the new
command is not placed in the LABM.

Figure 23a. Organization of Look-Aside Buffer Memory (LABM)

Figure 23b. Format of the Command Entry

B. Insert Commands (See Algorithm C in Section 3.3.2) - When an insert
command is received by the SMC, the AM is ordered to search for the logical
bucket name contained in T_i. If the response set is empty, then a command
entry is created in the RAM for the command and a data cell is created in the
AM for the logical bucket name in T_i. The pointer in the data cell is set to
point to the command entry just created. If the response set contains a
pointer, then the corresponding list of commands is scanned. If the list
contains a command whose order code is 'delete' and whose index term I_b is
the same as I_i, then the command entry is deleted and the new command is not
placed in the LABM.

C. Retrieve Commands (See Algorithms A and B in Section 3.3.2) - When a
retrieve command is received by the SMC, the AM is ordered to search
for those logical bucket names that satisfy the command's predicate with
respect to the logical bucket name in T_i. If the response set is not empty,
then the lists pointed to by the pointers in the response set are scanned for
'insert' commands. The index terms I_b of such commands are placed in the
retrieval set. Either Algorithm A or B in Section 3.3.2 is then executed. As
a result of the execution, the retrieval set is augmented. At this point, the
AM is again queried for logical bucket names satisfying the retrieval
predicate. The lists (if any) are scanned for 'delete' commands. The index
terms (I_b) of such commands are removed from the retrieval set, if they
occur in the retrieval set.
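
The three cases A, B and C can be summarized in a short Python sketch. It is a simplified model under stated assumptions, not a description of the hardware: the class name LABM and its methods are hypothetical, a dictionary keyed by logical bucket name stands in for the AM and the linked lists in the RAM, a pending command with no cancelling partner is assumed to be appended to its list, and a data cell is assumed to be released when its list becomes empty. As in the text, cancellation compares only the index terms.

```python
INSERT, DELETE = 1, 0


class LABM:
    def __init__(self):
        # logical bucket name -> list of (order_code, t_value, index_term)
        self.lists = {}

    def update(self, order_code, bucket, t_value, index_term):
        """Cases A and B: buffer an update or cancel a pending opposite."""
        cmds = self.lists.get(bucket)
        if cmds is None:
            # Empty response set: create a data cell and a command entry.
            self.lists[bucket] = [(order_code, t_value, index_term)]
            return
        opposite = INSERT if order_code == DELETE else DELETE
        for k, (code, _t, idx) in enumerate(cmds):
            if code == opposite and idx == index_term:
                del cmds[k]  # the pending command and the new one cancel
                if not cmds:
                    del self.lists[bucket]  # assumption: free the data cell
                return
        cmds.append((order_code, t_value, index_term))  # assumption

    def retrieve(self, predicate, sm_result):
        """Case C: merge pending updates into the result from the SM."""
        hits = [c for b, c in self.lists.items() if predicate(b)]
        result = {idx for cmds in hits for code, _t, idx in cmds
                  if code == INSERT}
        result |= set(sm_result)  # outcome of Algorithm A or B
        result -= {idx for cmds in hits for code, _t, idx in cmds
                   if code == DELETE}
        return result
```

Here predicate is a function of the logical bucket name (for instance, lambda b: b == 7), and sm_result stands for the index terms returned by Algorithm A or B.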

The AM has a very small number of data cells, typically under 64. The
exact number will be determined by the performance improvement achieved for a
certain number of data cells in relation to the cost of implementing the full
associativity. The number of command entries that can be stored in the RAM
should be larger than the number of data cells in the AM by a factor governed
by the average number of commands per logical bucket name. In many instances,
we do not expect retrieval sets to be affected by commands in the LABM. This
implies that the response set of the AM will be empty for many search orders.
Thus, the availability of an AM enables us to avoid fruitless scans of the
command queue for each and every retrieval command. In the absence of an AM
such scans become inevitable and may result in performance degradation.

When the AM is full (i.e., no data cells are vacant), the LABM is
considered full, although there may be a few vacant command entries in the
RAM. Such a condition forces the SMC to initiate execution of the pending
update commands on a FIFO basis. Each of the commands will require the
execution of either Algorithm C or D in Section 3.3.2. The execution of
LABM commands may also take place if the SMC has not received a retrieve
command for a length of time known as the time-out period.
4. THE STRUCTURE MEMORY INFORMATION PROCESSOR (SMIP)

This section presents the part of the DBC known as the structure
memory information processor (SMIP). The SMIP is a processor which performs
intersection on sets of index terms provided by the SM of the DBC. The
architecture of the SMIP is presented in three stages: First, we describe the
SMIP as a logical machine. Next, we discuss an implementation of the SMIP
involving hashing. Finally, the physical realization of the algorithms is
discussed in some depth.


4.1 Logical Description of the SMIP

As shown in Figure 24, the SMIP maintains an intermediate set which is
subsequently modified by the argument sets in the course of intersection.
The argument sets, which are sets of index terms, are supplied by the SM.
The intermediate set is designated SW and consists of couples (m,d) called
SMIP data units. The first part m of the couple is called the key and the
second part d is called the data. The manner in which this intermediate
set is manipulated is determined by the state of the SMIP and the command
received from the SM.

The concept of a partitioned content addressable memory (PCAM), used to
implement the SM, is also used here to realize the intermediate set. As shown
in Figure 24, the intermediate set is partitioned into subsets, each of
which contains one or more data units. As in the case of the SM, searching
is the most important operation carried out in the SMIP. Index terms in an
argument set are used as search keys to determine which one of the partitions
has to be searched for the existence of a data unit with a corresponding
matching key.

The SMIP can be in one of three states. These three states are known
as the initial, the active and the retrieval state. The SMIP is in the initial
state when it is ready to accept the first argument set from the SM. The SMIP
is in the active state when it has already processed one or more argument sets
and has formed an intermediate set. The SMIP is in the retrieval state when
the last of the argument sets has been received and the SMIP is in the process
of retrieving valid data units and sending them to the IXU. We shall see later
what we mean by valid data units.

There are two kinds of SMIP commands. The first kind of SMIP command
is represented by SMIP<m,g> where m is a key and g is a manipulation
function. The manipulation function can do one of two things depending on the
state of the SMIP. If the SMIP is in the initial state, then g is interpreted
as a "create" function which creates a SMIP data unit for m with the data part
d = 1. When the SMIP is in the active state, g is interpreted as a "replace"
function which may modify the data part of an existing SMIP data unit with key
m under certain conditions, or do nothing if a SMIP data unit with key m is
not found. The second kind of SMIP command is represented by SMIP<retrieve>,
which signals the end of the intersection operation and requests the SMIP
hardware to retrieve all valid data units.

Figure 24. Conceptual Model of SMIP

The SMIP maintains an argument set count (ASC) which represents states
and governs the action of the manipulation function. In the initial state, the
ASC has a value 0. Thus, when ASC is zero a command of the form SMIP<m,g> is
interpreted as a "create" function. This function creates in the SMIP memory
a data unit with m as the key and (ASC + 1) as the data. After all the
elements of the first argument set have been used for such "creation", the SM
sends an end-of-set signal. This signal increments ASC by 1, and changes the
SMIP state to active. When the second and subsequent argument sets are received,
g is interpreted as a "replace" function as follows: Let m be the key of
an element in the i-th argument set (i > 1). The SMIP memory is searched for a
data unit with m as the key. If it is found and if the data part d is
equal to ASC, then d is replaced by (d + 1). If the search does not succeed
in finding a data unit with m as the key, then no action is taken. When all
the elements of the i-th argument set have been processed in this way, ASC is
incremented by 1 and the SMIP remains in the active state to process the next
argument set. After all argument sets have been dispatched by the SM, the SM
sends the command SMIP<retrieve>. This command changes the SMIP state to
retrieval. In the retrieval state, the data parts of the data units in the SMIP
memory are compared to the value of ASC. Those data units whose data part d
equals the ASC are retrieved and sent to the IXU. Such data units are known as
valid data units. After the retrieval process is completed, the SMIP returns
to its initial state by setting ASC to 0.

For simplicity, we have not touched upon the role of the DFLAG (first
mentioned in Section 3.3.2) which accompanies every argument set except the
first. We shall now briefly describe the effect of the DFLAG. When the
DFLAG = 0, the SMIP behaves exactly as reported above. However, when the
DFLAG = 1, the g function acts as a delete function. The keys of every
element in an argument set with DFLAG = 1 are used to search the SMIP memory.
If a data unit with a matching key is found, it is deleted; if it is not found,
no action is taken. At the end of the search, the ASC is not incremented. The
need for the DFLAG arises out of the presence of negated keyword predicates in
queries.
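
The logical behavior of the SMIP described above (create, replace, delete under DFLAG = 1, and retrieval of valid data units) can be modelled in a few lines of Python. This is a sketch of the logical machine only; the class name SMIP, its method names and the helper intersect are hypothetical, and a dictionary stands in for the partitioned memory.

```python
class SMIP:
    def __init__(self):
        self.asc = 0      # argument set count; 0 means initial state
        self.units = {}   # intermediate set SW: key m -> data d

    def command(self, m, dflag=0):
        """SMIP<m,g>: the meaning of g depends on ASC and DFLAG."""
        if self.asc == 0:
            self.units[m] = self.asc + 1     # "create" with d = 1
        elif dflag == 1:
            self.units.pop(m, None)          # "delete" if present
        elif self.units.get(m) == self.asc:
            self.units[m] += 1               # "replace" d by d + 1

    def end_of_set(self, dflag=0):
        if dflag == 0:                       # ASC is not incremented
            self.asc += 1                    # after a DFLAG = 1 set

    def retrieve(self):
        """SMIP<retrieve>: return valid data units, then reset."""
        valid = sorted(m for m, d in self.units.items() if d == self.asc)
        self.units.clear()
        self.asc = 0
        return valid


def intersect(first, *rest):
    """rest is a sequence of (argument_set, dflag) pairs."""
    smip = SMIP()
    for m in first:
        smip.command(m)
    smip.end_of_set()
    for arg_set, dflag in rest:
        for m in arg_set:
            smip.command(m, dflag)
        smip.end_of_set(dflag)
    return smip.retrieve()
```

For instance, intersect({1, 2, 3, 4}, ({2, 3, 4, 5}, 0), ({3}, 1), ({2, 3, 4}, 0)) leaves keys 2 and 4 as the valid data units: key 1 is missing from the second set, key 5 was never created, and key 3 is removed by the negated set.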

In Figure 25, we have reproduced from [3] an algorithm which performs an
N-set intersection, with appropriate comments in the light of our discussions.

4.2  Implementation  Considerations 

In carrying out set intersections, the most frequently used operation in
the SMIP is the search-and-manipulate operation. Thus, any implementation of
the SMIP must provide an efficient and rapid search function. An important
characteristic of this search function is that it always manipulates at most
one SMIP data unit. This is because the data unit to be manipulated is
identified by a unique key which forms a part of the data unit. This
characteristic gives us a clue about a possible implementation technique,
namely, hashing. Hashing as a search technique is effective only when the
criterion for success is an exact match between a "search" key and a part of a
data unit stored in the SMIP. Hashing is effective because it partitions the
set of all possible search keys into equivalence classes by a relation defined
by the hash function. Each of these partitions is implemented by a bucket
memory. In order to locate a data unit with a key m, it is enough to search
the bucket memory into which the key is hashed, instead of searching the
entire SMIP memory. A system of bucket memories is implemented using the
concept of PCAM that we used to implement the SM. In a PCAM-implemented SMIP
each bucket is identified by a memory unit-processor pair. There are M such
pairs. The intermediate set SW is partitioned into K subsets designated
SUB_i, where K ≤ M. The command SMIP<m,g> is executed by hashing m to select
the one memory unit-processor pair whose partition is to be searched; the
command SMIP<retrieve> is executed by ordering all the K memory
unit-processor pairs to retrieve valid data units. It should be noted that
during the processing of an argument set, all the memory unit-processor pairs
will be simultaneously manipulating data units corresponding to different
search keys associated with different elements in the argument set. The size
of the memory units and the number of pairs are dictated by performance
requirements.

It may be interesting to point out some of the differences between the
manner in which PCAMs are employed in the SM and the manner in which PCAMs are
used in the SMIP. In the SM, a partition of the PCAM is implemented by
allocating one or more modules in one or more memory units. In the SMIP a
partition is usually completely contained in a single memory unit. [The case
when bucket overflow occurs will be discussed later as an exceptional case.]


Argument Sets X_1, X_2, ..., X_N.
begin
    for each element x_1k of X_1 do;
    begin
        execute the command SMIP<create, x_1k>;
        Comment: at this point ASC = 0, SMIP state = initial state
    end
    Comment: ASC ← ASC + 1, thus SMIP state = active state;
    for j = 2, 3, 4, ..., N do;
    begin
        for each element x_jk of X_j do;
        begin
            execute SMIP<replace, x_jk>;
            Comment: ASC = j-1, state = active;
        end
        ASC ← ASC + 1;
    end
    Execute the SMIP command SMIP<retrieve>;
    Comment: ASC = N, state = retrieve;

Figure 25: An N-set Intersection Algorithm

Because of the above difference, the type of parallelism available in the
two components is different. In the SM, all PEs (except those that are
masked) carry out the same operation on the same bucket (partition). In
the SMIP, the same operation could be performed concurrently on several
buckets (partitions), although all the operations would pertain to the same
argument set. The memory unit-processing element pairs of the SM are
synchronized in the execution of an order; this is not the case in the SMIP.
The memory unit-processor pairs of the SMIP each have a variable number of
searches to perform and are not synchronized. The variable number of searches
is due to the fact that any hashing function, however good, will exhibit local
nonrandomness leading to more searches in some buckets than in other buckets.

Why did we choose to implement PCAMs in two different ways in two
components which apparently have the same major function, namely, searching?

Before answering this question, we note parenthetically that if we were given
a choice, we would implement the SM PCAM in a manner similar to that of the
SMIP PCAM, i.e., an implementation involving concurrent searching of several
partitions. However, we cannot do this in the case of the SM because the SM
search space is much larger than the SMIP search space. In fact it is easy to
show that the entire SMIP search space is never greater than the largest
partition of the SM. As a result of this size disparity, the technology
applicable to the SMIP implementation is not applicable to the SM
implementation. Whereas in the SMIP we can use random access memories for each
of the partitions, it is not cost-effective to use random access memories for
the SM. Because of such a constraint, sequential or quasi-random access
memories have to be used for the SM. The access time in such memories is a
significant part of the overall retrieval time (i.e., access time + readout
time). Now, a partition of the SM is likely to be made up of several
accessible units. Therefore, we can gain in performance by distributing these
units over several of the processing elements and allowing them to
concurrently access the units belonging to the same partition. Such a scheme
will ensure that in the majority of instances the processing time will be no
greater than one access time plus one readout time. Now, consider searching
several SM partitions concurrently. Since partitions may be contained
entirely within the memory space attached to a single processing element, if
there are two requests to two partitions in the same memory unit, then access
to the partition corresponding to the second request will have to wait until
the first request is completely processed by the processing element. This may
involve several sequential accesses to the memory unit attached to the
processing element. Thus, we conclude that for any sequence of requests that
does not distribute uniformly over the partitions contained in the memory
spaces of processing elements, concurrent processing of partitions consumes
more time than concurrent processing of a single partition. In practice, we
cannot ensure such uniformity of requests, and, therefore, it is advantageous
to search a single partition at a time by a group of processing elements.

4.3 Physical Organization of the SMIP

The architecture of the SMIP is shown in Figure 26. The SMIP controller
(SMIPC) manages the operation of the structure memory information processing
elements (SMIPEs) via the SMIP control and data bus (SMIPCDB). Each SMIPE
normally realizes one bucket of a hash table. The common memory bus (CMB)
allows one SMIPE to use another's memory in case of bucket overflows in the
former.


4.3.1 The SMIP Controller

The SMIP controller is responsible for the following functions:

•Maintaining the value of the argument set count (ASC)

•Hashing the elements of an argument set and determining the
bucket to which the element belongs

•Controlling and issuing orders to the SMIP processing
elements

•Interpreting the orders from the SM.

The maintenance of the ASC is an important function since it determines
the command type. The ASC is an eight-bit counter which is set to zero or
incremented by the algorithms described later in this section.

Figure 26. The Architecture of the SMIP

The hash function applied to the elements of the argument set in order
to determine the bucket to be searched is simple and straightforward. Recall
that the elements in the argument set are index terms retrieved by the BMS
of the SM. Each index term is 28 bits long and is made up of 3 segments:
an MAU address (8 bits), a cluster identifier (10 bits), and a security
atom name (10 bits). The result of applying the hash function to this
28-bit index term is an n-bit number, where 2^n is the number of
MU-PE pairs in the SMIP. A simple hash function to do this would
concatenate ⌊n/3⌋ low order bits of the MAU address with ⌊n/3⌋ low order
bits of the cluster name and (n - ⌊n/3⌋ - ⌊n/3⌋) low order bits of the
security atom name, as shown in Figure 27. Of course there are several
equally effective hash functions that could be used. The purpose of building
one such function was to demonstrate the relative ease of creating hash
functions for our application. Such a function can be microprogrammed into
the SMIP controller.
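
A hash function of this kind is easily written down. The Python sketch below is illustrative only: the function name smip_hash is hypothetical, we assume the three segments are packed in the 28-bit index term in the order MAU address (high 8 bits), cluster identifier (middle 10 bits), security atom name (low 10 bits), and we assume that the first two segments each contribute ⌊n/3⌋ bits with the security atom name supplying the remainder.

```python
def smip_hash(index_term: int, n: int) -> int:
    """Map a 28-bit index term to an n-bit bucket number."""
    # Assumed field layout of the index term (high to low order).
    mau = (index_term >> 20) & 0xFF       # 8-bit MAU address
    cluster = (index_term >> 10) & 0x3FF  # 10-bit cluster identifier
    atom = index_term & 0x3FF             # 10-bit security atom name

    a = n // 3        # bits taken from the MAU address and the cluster
    c = n - 2 * a     # remaining bits taken from the security atom name

    def low(value, k):
        return value & ((1 << k) - 1)     # k low order bits of value

    # Concatenate the three bit groups into one n-bit number.
    return (low(mau, a) << (a + c)) | (low(cluster, a) << c) | low(atom, c)
```

With n = 4 (sixteen MU-PE pairs), an index term with MAU address 3, cluster identifier 5 and security atom name 6 contributes the bit groups 1, 1 and 10 and is sent to bucket 1110 in binary, i.e. bucket 14.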

The n-bit result of the hash function enables the controller to
identify the bucket to be searched for a match. However, it is quite
conceivable that the search time within a bucket is substantially longer than
the time taken to compute the hash function. This observation implies that a
speed mismatch is likely to develop between the controller and the processing
elements. Therefore, the controller maintains a buffer (whose size depends
on the severity of the mismatch) for each processing element in the SMIP.
After all the elements of an argument set have been transmitted by the SM and
the SMIP controller has placed them in their respective buffers, the controller
executes the following algorithm.


POLLING ALGORITHM

Step 0: Broadcast 'search' signal to all processing elements.

Step 1: Broadcast value of ASC and DFLAG (supplied by SM) to all
processing elements. Set EMPTY[i] ← 0 for all i.

Step 2: i ← 1.

Step 3: If the i-th processing element is not busy, then go to Step
4; else, go to Step 5.

Step 4: If the i-th buffer in the controller memory is non-empty,
then transmit a search key from the i-th buffer to the i-th
processing element. If the i-th buffer is empty, set
EMPTY[i] to 1.

Step 5: i ← i + 1. If i ≤ N, then go to Step 3. If i > N and
EMPTY[j] = 1 for all j ≤ N, then go to Step 6; else, go
to Step 2.

Step 6: ASC ← ASC + 1. Terminate.

Note: In the above algorithm EMPTY is an array of N bits where N is
the number of processing elements in the SMIP. Each bit is used
to indicate if the corresponding buffer is empty.
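
The polling loop can be sketched in Python as follows. The function name poll and the callable pe_busy are hypothetical; pe_busy(i) stands for the busy line of the i-th processing element and is assumed to become false eventually, otherwise the loop does not terminate. The sketch returns the order in which keys were dispatched together with the incremented ASC of Step 6.

```python
from collections import deque


def poll(buffers, pe_busy, asc):
    """Dispatch buffered search keys to the processing elements.

    buffers: one deque of search keys per processing element.
    pe_busy: hypothetical callable, pe_busy(i) is True while PE i works.
    """
    n = len(buffers)
    empty = [False] * n              # Step 1: EMPTY[i] <- 0 for all i
    sent = []
    while True:
        for i in range(n):           # Steps 2, 3 and 5: scan the PEs
            if pe_busy(i):
                continue             # busy PE: look at the next one
            if buffers[i]:           # Step 4: send one key, or note
                sent.append((i, buffers[i].popleft()))
            else:                    # that the buffer is exhausted
                empty[i] = True
        if all(empty):
            return sent, asc + 1     # Step 6: ASC <- ASC + 1
```

Since each pass hands at most one key to each idle processing element, a long buffer for one element does not delay dispatch to the others.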


When the SM indicates to the SMIP controller that all the argument sets have
been transmitted, the SMIP controller executes the following algorithm.


RETRIEVAL ALGORITHM

Step 0: Broadcast 'retrieve' signal to all processing elements.

Step 1: Broadcast value of ASC to all processing elements.

Step 2: Clear buffer memory in order to receive data units from the
processing elements.

Step 3: For all i, set EODFLAG[i] to 0.

Step 4: i ← 1.

Step 5: If the i-th processing element has end-of-data flag on, go to
Step 9.

Step 6: If the i-th processing element is ready to send a data unit,
then go to Step 7; else, go to Step 8.

Step 7: Place the data unit from the i-th processing element in the
buffer memory.

Step 8: i ← i + 1. If i ≤ N, go to Step 5. If i > N and if for
all j EODFLAG[j] = 1, then go to Step 10; else, go to Step 4.

Step 9: EODFLAG[i] ← 1. Go to Step 8.

Step 10: Transmit contents of buffer memory to IXU. Send 'clear'
signal to all processing elements. Set ASC to 0 and clear
buffer memory. Terminate.

Note: In the above algorithm EODFLAG is an array of N bits, each of
which is used to indicate if the corresponding processing element
has sent all the valid data units to the controller.




The SMIP controller is essentially a polling processor which monitors the
activities of the processing elements. Its other important function is to
interface with the SM and the IXU. Both these functions can be conveniently
implemented in a microprogrammed controller. Thus, the architecture of the
controller does not call for any sophistication beyond what is already
available in the commercial market. The buffer memory attached to the
controller should be large enough to hold either an argument set (during
intersection) or the total number of valid data units (during retrieval). We
expect the size of this memory to be small enough to employ semiconductor
RAMs.




4.3.2  The  Processing  Element  and  Memory  Unit  Pairs 

Each processing element has two main functions: First, it maintains
the memory unit associated with it. By this we mean that the processing
element is capable of adding, deleting, manipulating and retrieving data
units in its memory in an efficient manner. Second, it interfaces with the
SMIP controller. In this section, we shall address these two functions in
some detail.

Of the four operations that need to be performed on the data units in the
memory unit of the processing element, the most frequent operation is the
manipulation function. The data unit to be manipulated is identified by a
search key supplied by the SMIP controller. Such identification is facilitated
by once again employing a hashing strategy. The hash function employed within
the memory unit of the processing element should not be confused with the
hash function employed by the SMIP controller to identify the processing
element-memory unit pair. The memory unit attached to a processing element
is divided into two parts - a bucket index and a data memory. In this
organization, all data units whose search keys hash to the same bucket name are
placed in a linked list. The origin of the linked list is to be found in
the bucket index, while the list itself is placed in the data memory, as
illustrated in Figure 28. The hash function merely uses the high order n
bits of the search key to index into the bucket index. The size of n depends
on the relative size of the bucket index with respect to the size of the data
memory. Each entry in the data memory has three fields: The first field has
(28-n) bits of the search key; the second field is a count field; and the last
field is a pointer to the next entry in the linked list. The first and second
fields together constitute a data unit.

In our earlier discussion in Section 4.1, we indicated that the SMIP
has three logical states. Corresponding to these three states, the processing
element has three modes. These modes are set by the SMIP controller through
control signals in the SMIPCDB. The three modes are called the clear, search
and retrieve modes. In the clear mode, the bucket index is cleared, thus
effectively losing the data in the data memory. The processing element enters
the clear mode on receipt of a 'clear' signal from the SMIP controller.

In the search mode, a search key is supplied by the SMIP controller along
with the values of ASC and DFLAG. If the ASC value is zero, then an entry
in the data memory is created for the search key with the count field set
to 1. If the value of ASC is greater than zero and DFLAG is zero, then the
search key is used to search the data memory for a matching entry. If it is
found and if the count field equals the ASC value, then the count field is
incremented by one. If the matching entry is not found, or if the count field
is not equal to the ASC value, no action is taken. If the value of ASC is
greater than zero and DFLAG is 1, then the matching entry (if found) in the
data memory is deleted. The processing element enters the search mode upon
receipt of a 'search' control signal from the SMIP controller.

In the retrieve mode, the processing element systematically traverses the
lists originating from the bucket index and transmits all those data units
whose count field equals the ASC value supplied by the SMIP controller. The
processing element enters the retrieve mode upon receipt of the 'retrieve'
control signal.
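
The behaviour of the three modes can be sketched in Python. This is only an illustrative software model, not the hardware organization of the report: the class name, the choice n = 8, and the use of Python lists in place of linked entries in the data memory are all assumptions made for the sketch.

```python
KEY_BITS = 28   # width of a search key, per the (28-n) field size in the text
N_BITS = 8      # assumed width of the bucket name; the report leaves n open


class ProcessingElement:
    """Toy model of one processing element and its memory unit."""

    def __init__(self):
        self.clear()

    def clear(self):
        # Clear mode: resetting the bucket index discards every list.
        self.buckets = [[] for _ in range(1 << N_BITS)]

    def _split(self, key):
        # High-order n bits pick the bucket; the other 28-n bits are stored.
        low_bits = KEY_BITS - N_BITS
        return key >> low_bits, key & ((1 << low_bits) - 1)

    def search(self, key, asc, dflag=0):
        bucket, rest = self._split(key)
        chain = self.buckets[bucket]
        if asc == 0:
            chain.append([rest, 1])      # create function: count starts at 1
            return
        for pos, entry in enumerate(chain):
            if entry[0] == rest:
                if dflag == 1:
                    del chain[pos]       # delete the matching entry
                elif entry[1] == asc:
                    entry[1] += 1        # key also occurs in this argument set
                return
        # No matching entry: no action is taken.

    def retrieve(self, asc):
        # Retrieve mode: walk every list, emit keys whose count equals ASC.
        low_bits = KEY_BITS - N_BITS
        return [(b << low_bits) | rest
                for b, chain in enumerate(self.buckets)
                for rest, count in chain
                if count == asc]
```

In this model, loading a first argument set with ASC = 0 and a second with ASC = 1 leaves a count of 2 on exactly those keys present in both sets, so a retrieve with ASC = 2 yields their intersection.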

There may be occasions when a memory unit associated with a processing
element becomes full and cannot accommodate any more data units. Such a
condition will occur only during the execution of the create function
(ASC = 0). When such a condition occurs, the processing element concerned
attempts to obtain space from some other memory unit via the common memory
bus. The common memory bus (CMB) shown in Figure 25 can be used to access
the contents of any of the memory units. A processing element which
wishes to access its neighbor's memory unit must first obtain control of the
bus. It can then address the memory unit of another processing element for
reading or writing. All memory units have two I/O ports: one services
requests from the processing element to which the memory unit is attached, and
the second services requests from the common bus. Before a processing
element can use its neighbor's memory unit, it must make sure that space is
available in that memory unit. This is done by examining the allocation
register (see Figure 28). If the 'memory full' indicator is on, then the
processing element has to try another memory unit for space. If the 'memory
full' indicator is off, then the processing element reads off the contents
of the allocation register and increments it by 1. The testing, reading and
incrementing are carried out by a single indivisible instruction.
Indivisibility of this operation is important, since the local processing
element must not be allowed to access or manipulate the allocation register
while another processing element is accessing the same register via the CMB.
Once the space for a data unit has been allocated to a processing element, it
can be manipulated in exactly the same manner as a local data unit. The
processing element must remember the address of the memory unit from which it
has borrowed the space. This is done by incorporating extra bits in the
pointer field of a data unit in the data memory.
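
The indivisible test-read-increment on the allocation register can be sketched as follows. This is a software analogy under stated assumptions: a lock stands in for the hardware indivisibility, the 'memory full' indicator is modelled as a comparison against an assumed capacity, and the names `MemoryUnit` and `borrow_space` are hypothetical.

```python
import threading


class MemoryUnit:
    """Toy memory unit with an allocation register (names are illustrative)."""

    def __init__(self, unit_id, capacity):
        self.unit_id = unit_id
        self.capacity = capacity
        self.next_free = 0                # models the allocation register
        self._lock = threading.Lock()     # stands in for hardware indivisibility

    def test_read_increment(self):
        """Return the allocated slot, or None if 'memory full' is on."""
        with self._lock:                  # test, read and increment as one step
            if self.next_free >= self.capacity:
                return None
            slot = self.next_free
            self.next_free += 1
            return slot


def borrow_space(units, own_id):
    """Try the other memory units in turn; return (unit_id, slot) or None."""
    for unit in units:
        if unit.unit_id == own_id:
            continue
        slot = unit.test_read_increment()
        if slot is not None:
            # The unit id plays the role of the extra bits in the pointer field.
            return unit.unit_id, slot
    return None
```

Returning the unit identifier together with the slot mirrors the requirement that the borrowing element remember which memory unit the space came from.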

This completes our discussion on the logic of the processing element.
As can be observed, the logic is fairly simple and can be implemented on a
microprogrammed microprocessor. The memory requirements can be estimated
only when we have an idea of the performance requirements of the SMIP. It is
estimated that the requirements will be small enough to allow us to use random
access memories for the memory units attached to each of the processing
elements.


5.  THE INDEX TRANSLATION UNIT (IXU)

As a component of the DBC, the index translation unit (IXU) is responsible
for translating an index term received from the SMIP into an absolute MAU
address, a cluster identifier, and a security atom name. It is also responsible
for allocating and releasing cluster identifiers and security atom names as
demanded by the DBCCP. In the ensuing discussion, we first present a
rationale for having a separate unit to realize the above responsibilities.
We next describe the data structures. We then describe the algorithms executed
by the IXU. Finally, we estimate the hardware capabilities needed to realize
the IXU.


5.1  The  Need  for  an  IXU 

We learned that the KXU was specialized because of the need to produce
transformed values of keywords efficiently. Also, the SM and SMIP
hardware were designed to be efficient search engines. We may, perhaps,
contemplate incorporating the functions of the IXU into one of the three
components already mentioned. Consider consolidating the functions of the
IXU with those of the SM. We recall that the SM retrieves a large number
of index terms which will ultimately be rejected by the SMIP, because only a
few index terms will survive the intersection operation. By prematurely
translating all the index terms, the SM would be performing work that is not
really needed. Furthermore, by translating the index terms in the SM, we imply
that the SMIP must recognize and maintain three different data units,
corresponding to the three components of an index term, instead of a single
data unit. This in general would be reflected in more complex logic in the
SMIP. Thus, consolidating the IXU with the SM is not desirable. Next, consider
consolidating the functions of the IXU with the SMIP. In this case, in order to
maintain concurrent intersection and translation operations, a separate
processor would have to be built into the SMIP to take care of the translation
logic while the SMIP controller is polling the processing elements of the SMIP
for intersection. Such a decision essentially recognizes the independence of
the IXU. Lastly, we may consider consolidating the functions of the IXU with
those of the DBCCP. We shall see in [8] that the DBCCP is primarily designed to
achieve concurrency among separate components of the DBC. By making the IXU
a separate unit, additional concurrency can be achieved by the DBCCP. Thus,
the four components, namely the KXU, SM, SMIP and IXU, form a structure loop
(see Figure 1) which serves as a pipeline for structural information of the
database. We shall have more to say about this when we discuss the DBCCP in
Part III.


5.2  Data Structures in the IXU

Each file known to the DBC (i.e., a file that is stored in the MM) has an
MAU address table (MAUAT) maintained by the IXU. A MAUAT has up to 256
entries as shown in Figure 29. Each entry is two bytes long and contains the
address of an MAU. An MAU address need be no longer than 16 bits, since the
MM has fewer than 2^16 MAUs. [This is true because we are designing a DBC to
support a 10^10 byte database and an MAU can typically hold no less than
2 x 10^5 bytes. This means there are no more than 50,000 MAUs in the MM.]
Thus a MAUAT occupies 512 bytes. Associated with each MAUAT is an MAU bit
map (MAUBM) which indicates which (if any) of the 256 entries in the MAUAT
is (are) free for allocation. The bit map occupies 32 bytes (= 256 bits).
As we shall see in the next section, these two data structures are
used in the translation of MAU numbers (in the index term) into absolute MAU
addresses and in the maintenance of MAU numbers.

For each file, the IXU also maintains a cluster identifier bit map
(CIBM) and a security atom name bit map (SANBM). These bit maps are used
to keep track of the allocation and release of cluster identifiers and
security atom names. Each bit map occupies 1024 bits to keep track of 2^10
cluster identifiers or security atom names. Thus the total space devoted to
each file is 800 (= 512 + 32 + 128 + 128) bytes.

Under normal circumstances, we do not expect more than 2000 files to
exist within the DBC. Thus, the total memory capacity required is
of the order of 1.6 M bytes. Fortunately, as we shall see in the next
section, not all of the data structures need be accessible at any given
time. In fact, since the index terms transmitted by the SMIP in any
period of time correspond to a particular file, it is enough if the IXU
has the data structures of that file in rapid access memory. The
remaining data structures for other files may be in slower memory. We
therefore propose that there be two types of memory associated with the
IXU: a relatively small and fast memory (about 1000 bytes, implemented with
semiconductor RAMs) for holding data structures of the current file, and a
large and slower memory (about 1.6 M bytes, implemented with a fixed-head disk
or equivalent) for holding data structures of other files.
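
The storage estimates of this section can be checked with a few lines of arithmetic. The sketch below takes the database size as 10^10 bytes and the minimum MAU capacity as 2 x 10^5 bytes, the values implied by the 50,000-MAU bound; the variable names are illustrative only.

```python
MAUAT_ENTRIES = 256
ENTRY_BYTES = 2
NAMES_PER_FILE = 2 ** 10        # cluster identifiers, and security atom names
MAX_FILES = 2000

mauat_bytes = MAUAT_ENTRIES * ENTRY_BYTES     # 512
maubm_bytes = MAUAT_ENTRIES // 8              # 32  (one bit per MAUAT entry)
cibm_bytes = NAMES_PER_FILE // 8              # 128
sanbm_bytes = NAMES_PER_FILE // 8             # 128

per_file_bytes = mauat_bytes + maubm_bytes + cibm_bytes + sanbm_bytes
total_bytes = per_file_bytes * MAX_FILES

# Upper bound on the number of MAUs: database size / minimum MAU capacity.
max_maus = 10 ** 10 // (2 * 10 ** 5)

print(per_file_bytes, total_bytes, max_maus, max_maus < 2 ** 16)
# prints: 800 1600000 50000 True
```

The per-file figure of 800 bytes also explains why roughly 1000 bytes of fast memory suffice for the current file.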


5.3  Algorithms  Executed  by  the  IXU 

The IXU is controlled by the DBCCP, and responds to commands from the
DBCCP. The commands fall into two categories. The first category of commands
requires input data (i.e., index terms) from the SMIP and results in the
translation of the index terms into MAU addresses, cluster identifiers and
security atom names. The second category of commands is used by the DBCCP
to allocate or release MAUs, cluster identifiers and security atom names.
In the algorithms presented below, Algorithms A, B, C and D are executed in
response to the first category of commands, while the remaining algorithms
are executed in response to commands of the second category.


ALGORITHM A:  Extract from the next set of index terms the MAU addresses
              with the help of the MAUAT of file F.

Input Arguments:  1.  Index terms from SMIP
                  2.  File ID, F, from DBCCP

Step 1:  Load the MAUAT for the file F from the secondary memory into
         the primary memory.

Step 2:  Wait for index terms from SMIP.

Step 3:  For each index term from SMIP, do Step 4.

Step 4:  Extract the MAU number (high order 8 bits) from the index term.
         Use the MAU number as an index into the MAUAT. Retrieve the
         MAU address from the accessed entry.

Step 5:  Transmit all MAU addresses to DBCCP. Terminate.

ALGORITHM B:  Extract from the next set of index terms the cluster
              identifiers.

Input Arguments:  1.  Index terms from SMIP

Step 1:  Wait for index terms from SMIP.

Step 2:  For each index term from SMIP, do Step 3.

Step 3:  Extract bits 8 through 17 of the index term.

Step 4:  Transmit all the cluster identifiers extracted in Step 3 to DBCCP.
         Terminate.


ALGORITHM C:  Extract from the next set of index terms the security atom
              names.

Input Arguments:  1.  Index terms from SMIP

Step 1:  Wait for index terms from SMIP.

Step 2:  For each index term from SMIP, do Step 3.

Step 3:  Extract bits 18 through 27 of the index term.

Step 4:  Transmit all the security atom names extracted in Step 3
         to DBCCP. Terminate.




ALGORITHM D:  Extract from the next set of index terms the MAU addresses,
              the cluster identifiers and the security atom names.

Input Arguments:  1.  Index terms from SMIP
                  2.  File ID, F, from DBCCP

Step 1:  Load the MAUAT for the file F from the secondary memory into
         the primary memory.

Step 2:  Wait for index terms from SMIP.

Step 3:  For each index term from SMIP, do Step 4.

Step 4:  Extract the MAU number (bits 0 through 7) from the index term.
         Use the MAU number as an index into the MAUAT. Retrieve the
         MAU address from the accessed entry.

Step 5:  Transmit each MAU address and the 20 low order bits of the
         corresponding index term to the DBCCP. Terminate.
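
The field extraction common to Algorithms A through D can be sketched in Python. The sketch assumes a 28-bit index term in which bit 0 is the most significant bit, so that bits 0 through 7 are the high order 8 bits; the function names and the representation of the MAUAT as a 256-entry list are illustrative assumptions.

```python
def split_index_term(term):
    """Split a 28-bit index term into its three fields.

    Bit 0 is taken to be the most significant bit of the term.
    """
    mau_number = (term >> 20) & 0xFF      # bits 0 through 7
    cluster_id = (term >> 10) & 0x3FF     # bits 8 through 17
    atom_name = term & 0x3FF              # bits 18 through 27
    return mau_number, cluster_id, atom_name


def algorithm_d(index_terms, mauat):
    """Translate each term into (MAU address, cluster id, security atom name)."""
    results = []
    for term in index_terms:
        mau_number, cluster_id, atom_name = split_index_term(term)
        # The MAU number indexes the MAUAT; the other fields pass through.
        results.append((mauat[mau_number], cluster_id, atom_name))
    return results
```

Algorithms A, B and C correspond to keeping only the first, second or third element of each result, respectively.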


ALGORITHM E:  To allocate a new MAU number for a file F and absolute MAU
              address M_a.

Input Arguments:  1.  File ID, F, from DBCCP
                  2.  Absolute MAU address M_a from DBCCP

Step 1:  Load the MAUAT and MAUBM from the secondary memory into
         primary memory.

Step 2:  Scan MAUBM for a bit which is turned off. If none is found, go
         to Step 5. Call the selected bit b. The position of b
         with respect to the first bit in MAUBM is given by A(b).

Step 3:  Turn b on. Use A(b) as an index into the MAUAT, and place
         the MAU address M_a in the entry indicated by A(b).

Step 4:  Transmit b to the DBCCP. Terminate.

Step 5:  Send error signal to DBCCP. Terminate.


ALGORITHM F:  To allocate a new cluster identifier for file F.

Input Arguments:  1.  File ID, F, from DBCCP

Step 1:  Load the CIBM into primary memory.

Step 2:  Scan CIBM for a bit which is turned off. If none is found, go
         to Step 4. Call the selected bit b.

Step 3:  Turn on b. Transmit b to DBCCP. Terminate.

Step 4:  Send error signal to DBCCP. Terminate.


ALGORITHM G:  To allocate a new security atom name for file F.

Input Arguments:  1.  File ID, F, from DBCCP

Step 1:  Execute Steps 1 through 4 of Algorithm F using SANBM instead
         of CIBM.
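
The allocation algorithms share one pattern: scan a bit map for a bit that is off, turn it on, and report it. A minimal sketch follows. It assumes that the first free bit is chosen (the report only requires some bit that is off) and that the bit's position is what gets reported; `BitMap` and `allocate_mau_number` are hypothetical names.

```python
class BitMap:
    """Fixed-size allocation bit map; a set bit marks an allocated name."""

    def __init__(self, size):
        self.bits = [False] * size

    def allocate(self):
        """Turn on the first bit that is off; return its position, or None."""
        for position, used in enumerate(self.bits):
            if not used:
                self.bits[position] = True
                return position
        return None      # no free bit: corresponds to the error signal


def allocate_mau_number(mauat, maubm, mau_address):
    """Allocation of an MAU number: bind a free number to an MAU address."""
    number = maubm.allocate()
    if number is not None:
        mauat[number] = mau_address
    return number
```

A `BitMap(1024)` would serve as either the CIBM or the SANBM of a file, and a `BitMap(256)` as its MAUBM.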


ALGORITHM H:  To release an MAU number.

Input Arguments:  1.  File ID, F, from DBCCP
                  2.  Absolute MAU address M_a from DBCCP

Step 1:  Load MAUAT and MAUBM for file F.

Step 2:  Scan the entries of MAUAT to obtain a match between the contents
         of an entry in MAUAT and the argument MAU address M_a. Call the
         index of the entry k.

Step 3:  Turn off the k-th bit in the MAUBM. Terminate.


ALGORITHM I:  To release a cluster identifier for a file F.

Input Arguments:  1.  File ID, F, from DBCCP
                  2.  Cluster identifier k

Step 1:  Load CIBM for file F.

Step 2:  Turn off the k-th bit in CIBM. Terminate.


ALGORITHM J:  To release a security atom name for file F.

Input Arguments:  1.  File ID, F, from DBCCP
                  2.  Security atom name k

Step 1:  Load SANBM for file F.

Step 2:  Turn off the k-th bit in SANBM. Terminate.
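
The release algorithms can be sketched in the same spirit. Here each bit map is modelled as a plain list of booleans, and the function names are hypothetical. The check that a matching MAUAT entry is currently allocated is an added safeguard against stale table entries; it is not stated in the report.

```python
def release_mau_number(mauat, maubm, mau_address):
    """Free the MAU number whose MAUAT entry holds mau_address.

    Returns the released index k, or None if no allocated entry matches.
    """
    for k, address in enumerate(mauat):
        if maubm[k] and address == mau_address:
            maubm[k] = False          # turn off the k-th bit in the MAUBM
            return k
    return None


def release_name(bit_map, k):
    """Turn off the k-th bit of a CIBM or SANBM."""
    bit_map[k] = False
```

Releasing an MAU number needs a scan because the DBCCP supplies the absolute address rather than the number, whereas cluster identifiers and security atom names index their bit maps directly.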


5.4  Hardware Considerations


In Figure 30, we show the organization of the various components of the
IXU. The fast access memory is limited to what is required by a single file,
since at any given time the index terms arriving from the SMIP pertain only
to a single file, and the file's name is known to the IXU (via the DBCCP).

The bulk of the file information is kept in a larger, slower memory. The
leading candidates for this memory are fixed-head disk and CCD memories.
Certain organizations of CCD have better access times than the fixed-head
disk, and the cost per bit is almost the same for the two devices. If the
CCD's volatility is at issue, then the use of disk may be more attractive.

The algorithms to be executed by the IXU are straightforward and can be
microprogrammed into a minicomputer (or a fast microprocessor).


[Figure 30. Organization of the IXU. Labels recoverable from the figure:
Control, Memory.]


6.  CONCLUDING REMARKS

One of the main reasons for presenting a design in considerable detail is
to convince ourselves, and hopefully the reader, that the components described
herein can indeed be constructed. The details, however, do not include a cost-
performance evaluation of the design. This, in fact, is the topic of
ongoing research. The SM was treated in much greater depth than the other
components. This is because the characteristics of emerging technology
have the strongest influence on the capabilities of this component. It
became necessary to expound carefully the requirements of the SM and to
examine design alternatives using different technologies. On the other hand,
we have not included descriptions of interfaces between components and their
bus protocols. We felt that, with the given design of simple commands and
straightforward data structures for the communication between components,
interface and bus structures are likely to be conventional and routine.